SET THEORY


1. Sets and Functions


1.1. Basic definitions. Mathematics habitually deals with "sets" made up
of "elements" of various kinds, e.g., the set of faces of a polyhedron, the
set of points on a line, the set of all positive integers, and so on. Because of
their generality, it is hard to define these concepts in a way that does more
than merely replace the word "set" by some equivalent term like "class,"
"family," "collection," etc. and the word "element" by some equivalent
term like "member." We will adopt a "naive" point of view and regard the
notions of a set and the elements of a set as primitive and well-understood.

The set concept plays a key role in modern mathematics. This is partly 
due to the fact that set theory, originally developed towards the end of the 
nineteenth century, has by now become an extensive subject in its own right. 
More important, however, is the great influence which set theory has exerted 
and continues to exert on mathematical thought as a whole. In this chapter, 
we introduce the basic set-theoretic notions and notation to be used in the 
rest of the book. 

Sets will be denoted by capital letters like A, B, ..., and elements of
sets by small letters like a, b, .... The set with elements a, b, c, ... is often
denoted by {a, b, c, ...}, i.e., by writing the elements of the set between
curly brackets. For example, {1} is the set whose only member is 1, while
{1, 2, ..., n, ...} is the set of all positive integers. The statement "the
element a belongs to the set A" is written symbolically as $a \in A$, while
$a \notin A$ means that "the element a does not belong to the set A." If every
element of a set A also belongs to a set B, we say that A is a subset of the
set B and write $A \subset B$ or $B \supset A$ (also read as "A is contained in B" or
"B contains A"). For example, the set of all even numbers is a subset of the
set of all real numbers. We say that two sets A and B are equal and write
A = B if A and B consist of precisely the same elements. Note that A = B
if and only if $A \subset B$ and $B \subset A$, i.e., if and only if every element of A is an
element of B and every element of B is an element of A. If $A \subset B$ but $A \neq B$,
we call A a proper subset of B.

Sometimes it is not known in advance whether or not a certain set (for 
example, the set of roots of a given equation) contains any elements at all. 
Thus it is convenient to introduce the concept of the empty set, i.e., the set
containing no elements at all. This set will be denoted by the symbol $\emptyset$.
The set $\emptyset$ is clearly a subset of every set (why?).
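The notions of membership, subset, equality and the empty set can be tried out on small finite sets. The following is only an illustrative sketch using Python's built-in `set` type; the particular sets `A` and `B` are made-up examples, not taken from the text.

```python
# Hypothetical small sets chosen purely for illustration.
A = {2, 4, 6}
B = {1, 2, 3, 4, 5, 6}
empty = set()                      # the empty set

is_member = 2 in A                 # "2 belongs to A"
is_not_member = 5 not in A         # "5 does not belong to A"
is_subset = A <= B                 # every element of A belongs to B
is_proper_subset = A < B           # A is a subset of B, but A != B

# A == B exactly when each set is a subset of the other.
equality_by_inclusion = (A == B) == (A <= B and B <= A)

# The empty set is a subset of every set: there is no element
# of the empty set that could fail to belong to A or to B.
empty_is_subset = empty <= A and empty <= B
```

Note that `<=` plays the role of the inclusion sign used in the text, which does not exclude equality.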


[Figure 1: the union of A and B. Figure 2: the intersection of A and B.]


1.2. Operations on sets. Let A and B be any two sets. Then by the sum
or union of A and B, denoted by $A \cup B$, is meant the set consisting of all
elements which belong to at least one of the sets A and B (see Figure 1).
More generally, by the sum or union of an arbitrary number (finite or
infinite) of sets $A_\alpha$ (indexed by some parameter $\alpha$), we mean the set, denoted by

$$\bigcup_\alpha A_\alpha,$$

of all elements belonging to at least one of the sets $A_\alpha$.

By the intersection $A \cap B$ of two given sets A and B, we mean the set
consisting of all elements which belong to both A and B (see Figure 2). For
example, the intersection of the set of all even numbers and the set of all
integers divisible by 3 is the set of all integers divisible by 6. By the
intersection of an arbitrary number (finite or infinite) of sets $A_\alpha$, we mean the
set, denoted by

$$\bigcap_\alpha A_\alpha,$$

of all elements belonging to every one of the sets $A_\alpha$. Two sets A and B are
said to be disjoint if $A \cap B = \emptyset$, i.e., if they have no elements in common.
More generally, let $\mathscr{F}$ be a family of sets such that $A \cap B = \emptyset$ for every
pair of sets A, B in $\mathscr{F}$. Then the sets in $\mathscr{F}$ are said to be pairwise disjoint.
It is an immediate consequence of the above definitions that the operations
$\cup$ and $\cap$ are commutative and associative, i.e., that

$$A \cup B = B \cup A, \qquad (A \cup B) \cup C = A \cup (B \cup C),$$
$$A \cap B = B \cap A, \qquad (A \cap B) \cap C = A \cap (B \cap C).$$

Moreover, the operations $\cup$ and $\cap$ obey the following distributive laws:

$$(A \cup B) \cap C = (A \cap C) \cup (B \cap C), \tag{1}$$
$$(A \cap B) \cup C = (A \cup C) \cap (B \cup C). \tag{2}$$

For example, suppose $x \in (A \cup B) \cap C$, so that x belongs to the left-hand
side of (1). Then x belongs to both C and $A \cup B$, i.e., x belongs to both
C and at least one of the sets A and B. But then x belongs to at least one of
the sets $A \cap C$ and $B \cap C$, i.e., $x \in (A \cap C) \cup (B \cap C)$, so that x belongs
to the right-hand side of (1). Conversely, suppose $x \in (A \cap C) \cup (B \cap C)$.
Then x belongs to at least one of the two sets $A \cap C$ and $B \cap C$. It follows
that x belongs to both C and at least one of the two sets A and B, i.e., $x \in C$
and $x \in A \cup B$ or equivalently $x \in (A \cup B) \cap C$. This proves (1), and (2) is
proved similarly.

[Figure 3: the difference A - B. Figure 4: the symmetric difference of A and B.]
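The distributive laws (1) and (2) can also be checked mechanically on a small example. The sketch below is not a proof; it merely tests both identities for every triple of subsets of an assumed three-element universe `U` (a choice made here for illustration).

```python
from itertools import combinations, product

def subsets(universe):
    """Return every subset of a finite universe as a frozenset."""
    items = sorted(universe)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]

U = {1, 2, 3}                                  # assumed small universe
triples = list(product(subsets(U), repeat=3))  # all (A, B, C)

# Law (1): (A u B) n C = (A n C) u (B n C)
law_1_holds = all((A | B) & C == (A & C) | (B & C) for A, B, C in triples)

# Law (2): (A n B) u C = (A u C) n (B u C)
law_2_holds = all((A & B) | C == (A | C) & (B | C) for A, B, C in triples)
```

An exhaustive check of this kind only covers the chosen universe; the element-chasing argument in the text is what establishes the laws in general.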

By the difference $A - B$ between two sets A and B (in that order), we
mean the set of all elements of A which do not belong to B (see Figure 3).
Note that it is not assumed that $A \supset B$. It is sometimes convenient (e.g., in
measure theory) to consider the symmetric difference of two sets A and B,
denoted by $A \triangle B$ and defined as the union of the two differences $A - B$
and $B - A$ (see Figure 4):


$$A \triangle B = (A - B) \cup (B - A).$$


We will often be concerned later with various sets which are all subsets 
of some underlying basic set R, for example, various sets of points on the 
real line. In this case, given a set A, the difference R — A is called the 
complement of A, denoted by CA. 
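As a small illustration of the difference, the symmetric difference and the complement, one can compute them for finite sets. The sets `R`, `A` and `B` below are assumptions made for the example; Python's `-` and `^` operators on sets correspond to the difference and the symmetric difference.

```python
R = set(range(10))        # assumed underlying basic set {0, 1, ..., 9}
A = {1, 2, 3, 4}
B = {3, 4, 5}

difference = A - B                 # elements of A not in B
symmetric_difference = A ^ B       # elements in exactly one of A, B
complement_of_A = R - A            # complement relative to R

# The symmetric difference agrees with its definition as a union
# of the two differences.
matches_definition = symmetric_difference == (A - B) | (B - A)
```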
An important role is played in set theory and its applications by the
following "duality principle":

$$R - \bigcup_\alpha A_\alpha = \bigcap_\alpha (R - A_\alpha), \tag{3}$$
$$R - \bigcap_\alpha A_\alpha = \bigcup_\alpha (R - A_\alpha). \tag{4}$$

In words, the complement of a union equals the intersection of the
complements, and the complement of an intersection equals the union of the
complements. According to the duality principle, any theorem involving a
family of subsets of a fixed set R can be converted automatically into another,
"dual" theorem by replacing all subsets by their complements, all unions
by intersections and all intersections by unions. To prove (3), suppose

$$x \in R - \bigcup_\alpha A_\alpha. \tag{5}$$

Then x does not belong to the union

$$\bigcup_\alpha A_\alpha, \tag{6}$$

i.e., x does not belong to any of the sets $A_\alpha$. It follows that x belongs to each
of the complements $R - A_\alpha$, and hence

$$x \in \bigcap_\alpha (R - A_\alpha). \tag{7}$$

Conversely, suppose (7) holds, so that x belongs to every set $R - A_\alpha$. Then
x does not belong to any of the sets $A_\alpha$, i.e., x does not belong to the union
(6), or equivalently (5) holds. This proves (3), and (4) is proved similarly
(give the details).
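Formulas (3) and (4) can be illustrated on a finite family of sets. In the sketch below the basic set `R` and the family of "multiples of k" are assumptions chosen for the example; both sides of each formula are computed independently and compared.

```python
R = set(range(12))                                   # assumed basic set
family = [{n for n in R if n % k == 0} for k in (2, 3, 4)]

union = set().union(*family)
intersection = set(R).intersection(*family)

# (3): complement of the union = intersection of the complements
lhs_3 = R - union
rhs_3 = set(R).intersection(*(R - A for A in family))

# (4): complement of the intersection = union of the complements
lhs_4 = R - intersection
rhs_4 = set().union(*(R - A for A in family))
```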


Remark. The designation "symmetric difference" for the set $A \triangle B$ is
not too apt, since $A \triangle B$ has much in common with the sum $A \cup B$. In fact,
in $A \cup B$ the two statements "x belongs to A" and "x belongs to B" are
joined by the conjunction "or" used in the "either ... or ... or both ..."
sense, while in $A \triangle B$ the same two statements are joined by "or" used in the
ordinary "either ... or ..." sense (as in "to be or not to be"). In other words,
x belongs to $A \cup B$ if and only if x belongs to either A or B or both, while x
belongs to $A \triangle B$ if and only if x belongs to either A or B but not both. The
set $A \triangle B$ can be regarded as a kind of "modulo-two sum" of the sets A and
B, i.e., a sum of the sets A and B in which elements are dropped if they are
counted twice (once in A and once in B).


1.3. Functions and mappings. Images and preimages. A rule associating a 
unique real number y = f(x) with each element of a set of real numbers X 
is said to define a (real) function f on X. The set X is called the domain 
(of definition) of f, and the set Y of all numbers f(x) such that $x \in X$ is called
the range of f. 
More generally, let M and N be two arbitrary sets. Then a rule associating
a unique element $b = f(a) \in N$ with each element $a \in M$ is again said to define
a function f on M (or a function f with domain M). In this more general
context, f is usually called a mapping of M into N. By the same token, f is
said to map M into N (and a into b).

If a is an element of M, the corresponding element b = f(a) is called the
image of a (under the mapping f). Every element of M with a given element
$b \in N$ as its image is called a preimage of b. Note that in general b may have
several preimages. Moreover, N may contain elements with no preimages
at all. If b has a unique preimage, we denote this preimage by $f^{-1}(b)$.

If A is a subset of M, the set of all elements $f(a) \in N$ such that $a \in A$
is called the image of A, denoted by f(A). The set of all elements of M whose
images belong to a given set $B \subset N$ is called the preimage of B, denoted
by $f^{-1}(B)$. If no element of B has a preimage, then $f^{-1}(B) = \emptyset$. A function
f is said to map M into N if $f(M) \subset N$, as is always the case, and onto N
if $f(M) = N$.¹ Thus every "onto mapping" is an "into mapping," but not
conversely.
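For finite sets, images and preimages can be computed directly from the definitions. The following is a minimal sketch; the helper names `image` and `preimage`, the sets `M` and `N`, and the squaring map are all assumptions introduced for illustration.

```python
def image(f, A):
    """The set f(A) of all values f(a) with a in A."""
    return {f(a) for a in A}

def preimage(f, M, B):
    """The set of all elements of the domain M whose image lies in B."""
    return {a for a in M if f(a) in B}

def square(a):
    return a * a

M = {-2, -1, 0, 1, 2}          # assumed domain
N = {0, 1, 2, 3, 4}            # assumed target set

range_of_f = image(square, M)             # f(M)
maps_into_N = range_of_f <= N             # always true for a mapping into N
maps_onto_N = range_of_f == N             # false here: 2 and 3 are missed

two_preimages = preimage(square, M, {1})      # 1 has several preimages
no_preimages = preimage(square, M, {2, 3})    # empty preimage
```

Here the element 1 has two preimages while 2 and 3 have none, matching the two remarks made in the text.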

Suppose f maps M onto N. Then f is said to be one-to-one if each element
$b \in N$ has a unique preimage $f^{-1}(b)$. In this case, f is said to establish a
one-to-one correspondence between M and N, and the mapping $f^{-1}$
associating $f^{-1}(b)$ with each $b \in N$ is called the inverse of f.


THEOREM 1. The preimage of the union of two sets is the union of the
preimages of the sets:

$$f^{-1}(A \cup B) = f^{-1}(A) \cup f^{-1}(B).$$

Proof. If $x \in f^{-1}(A \cup B)$, then $f(x) \in A \cup B$, so that f(x) belongs
to at least one of the sets A and B. But then x belongs to at least one of
the sets $f^{-1}(A)$ and $f^{-1}(B)$, i.e., $x \in f^{-1}(A) \cup f^{-1}(B)$.

Conversely, if $x \in f^{-1}(A) \cup f^{-1}(B)$, then x belongs to at least one
of the sets $f^{-1}(A)$ and $f^{-1}(B)$. Therefore f(x) belongs to at least one of
the sets A and B, i.e., $f(x) \in A \cup B$. But then $x \in f^{-1}(A \cup B)$. ∎²


THEOREM 2. The preimage of the intersection of two sets is the
intersection of the preimages of the sets:

$$f^{-1}(A \cap B) = f^{-1}(A) \cap f^{-1}(B).$$

Proof. If $x \in f^{-1}(A \cap B)$, then $f(x) \in A \cap B$, so that $f(x) \in A$ and
$f(x) \in B$. But then $x \in f^{-1}(A)$ and $x \in f^{-1}(B)$, i.e., $x \in f^{-1}(A) \cap f^{-1}(B)$.

Conversely, if $x \in f^{-1}(A) \cap f^{-1}(B)$, then $x \in f^{-1}(A)$ and $x \in f^{-1}(B)$.
Therefore $f(x) \in A$ and $f(x) \in B$, i.e., $f(x) \in A \cap B$. But then
$x \in f^{-1}(A \cap B)$. ∎


¹ As in the case of real functions, the set f(M) is called the range of f.
² The symbol ∎ stands for Q.E.D. and indicates the end of a proof.
THEOREM 3. The image of the union of two sets equals the union of the
images of the sets:

$$f(A \cup B) = f(A) \cup f(B).$$

Proof. If $y \in f(A \cup B)$, then y = f(x) where x belongs to at least one
of the sets A and B. Therefore y = f(x) belongs to at least one of the sets
f(A) and f(B), i.e., $y \in f(A) \cup f(B)$.

Conversely, if $y \in f(A) \cup f(B)$, then y = f(x) where x belongs to at
least one of the sets A and B, i.e., $x \in A \cup B$ and hence
$y = f(x) \in f(A \cup B)$. ∎


Remark 1. Surprisingly enough, the image of the intersection of two sets
does not necessarily equal the intersection of the images of the sets. For
example, suppose the mapping f projects the xy-plane onto the x-axis,
carrying the point (x, y) into the point (x, 0). Then the segments $0 \le x \le 1$,
y = 0 and $0 \le x \le 1$, y = 1 do not intersect, although their images coincide.
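Theorems 1-3 and the counterexample of Remark 1 can be reproduced on a finite grid of points. This is only a sketch under stated assumptions: the grid `M`, the two "segments" `A` and `B`, and the target sets `C` and `D` are made-up finite stand-ins for the plane and its subsets, and `image` and `preimage` are hypothetical helper names.

```python
def image(f, A):
    """The set f(A) of all values f(a) with a in A."""
    return {f(a) for a in A}

def preimage(f, M, B):
    """All elements of the domain M whose image lies in B."""
    return {a for a in M if f(a) in B}

def project(p):
    """Projection onto the x-axis: (x, y) goes to (x, 0)."""
    return (p[0], 0)

M = {(x, y) for x in range(3) for y in range(2)}   # finite stand-in for the plane
A = {(x, 0) for x in range(3)}                     # "segment" on the line y = 0
B = {(x, 1) for x in range(3)}                     # "segment" on the line y = 1

# Theorem 3: images respect unions.
union_ok = image(project, A | B) == image(project, A) | image(project, B)

# Remark 1: images need not respect intersections.
image_of_intersection = image(project, A & B)                   # A and B are disjoint
intersection_of_images = image(project, A) & image(project, B)  # images coincide

# Theorems 1 and 2: preimages respect unions and intersections.
C, D = {(0, 0), (1, 0)}, {(1, 0), (2, 0)}
pre_union_ok = preimage(project, M, C | D) == (
    preimage(project, M, C) | preimage(project, M, D))
pre_intersection_ok = preimage(project, M, C & D) == (
    preimage(project, M, C) & preimage(project, M, D))
```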


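Remark 2. Theorems 1-3 continue to hold for unions and intersections
of an arbitrary number (finite or infinite) of sets $A_\alpha$:

$$f^{-1}\Bigl(\bigcup_\alpha A_\alpha\Bigr) = \bigcup_\alpha f^{-1}(A_\alpha),$$
$$f^{-1}\Bigl(\bigcap_\alpha A_\alpha\Bigr) = \bigcap_\alpha f^{-1}(A_\alpha),$$
$$f\Bigl(\bigcup_\alpha A_\alpha\Bigr) = \bigcup_\alpha f(A_\alpha).$$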


1.4. Decomposition of a set into classes. Equivalence relations.
Decompositions of a given set into pairwise disjoint subsets play an important role
in a great variety of problems. For example, the plane (regarded as a point
set) can be decomposed into lines parallel to the x-axis, three-dimensional
space can be decomposed into concentric spheres, the inhabitants of a given 
city can be decomposed into different age groups, and so on. Any such 
representation of a given set M as the union of a family of pairwise disjoint 
subsets of M is called a decomposition or partition of M into classes. 

A decomposition is usually made on the basis of some criterion, allowing 
us to assign the elements of M to one class or another. For example, the 
set of all triangles in the plane can be decomposed into classes of congruent 
triangles or into classes of triangles of equal area, the set of all functions 
of x can be decomposed into classes of functions all taking the same value at 
a given point x, and so on. Despite the great variety of such criteria, they 
are not completely arbitrary. For example, it is obviously impossible to 
partition all real numbers into classes by assigning the number b to the same
class as the number a if and only if b > a. In fact, if b > a, b must be
assigned to the same class as a, but then a cannot be assigned to the same
class as b, since a < b. Moreover, since a is not greater than itself, a cannot
even be assigned to the class containing itself! As another example, it is
impossible to partition the points of the plane into classes by assigning two
points to the same class if and only if the distance between them is less than 1.
In fact, if the distance between a and b is less than 1 and if the distance
between b and c is less than 1, it does not follow that the distance between
a and c is less than 1. Thus, by assigning a to the same class as b and b to
the same class as c, we may well find that two points fall in the same class
even though the distance between them is greater than 1!

These examples suggest conditions which must be satisfied by any criterion
if it is to be used as the basis for partitioning a given set into classes. Let
M be a set, and let certain ordered pairs (a, b) of elements of M be called
"labelled." If (a, b) is a labelled pair, we say that a is related to b by the
(binary) relation R and write aRb.³ For example, if a and b are real numbers,
aRb might mean a < b, while if a and b are triangles, aRb might mean that
a and b have the same area. A relation between elements of M is called
a relation on M if there is at least one labelled pair (a, b) for every $a \in M$.
A relation R on M is called an equivalence relation (on M) if it satisfies the
following three conditions:


1) Reflexivity: aRa for every $a \in M$;
2) Symmetry: If aRb, then bRa;
3) Transitivity: If aRb and bRc, then aRc.


THEOREM 4. A set M can be partitioned into classes by a relation R 
(acting as a criterion for assigning two elements to the same class) if and 
only if R is an equivalence relation on M. 


Proof. Every partition of M determines a binary relation on M, where
aRb means that "a belongs to the same class as b." It is then obvious
that R must be reflexive, symmetric and transitive, i.e., that R is an
equivalence relation on M.

Conversely, let R be an equivalence relation on M, and let $K_a$ be the
set of all elements $x \in M$ such that xRa (clearly $a \in K_a$, since R is
reflexive). Then two classes $K_a$ and $K_b$ are either identical or disjoint.
In fact, suppose an element c belongs to both $K_a$ and $K_b$, so that cRa
and cRb. Then aRc by the symmetry, and hence

$$aRb \tag{8}$$

by the transitivity. If now $x \in K_a$, then xRa and hence xRb by (8) and the
transitivity, i.e., $x \in K_b$. Virtually the same argument shows that $x \in K_b$
implies $x \in K_a$. Therefore $K_a = K_b$ if $K_a$ and $K_b$ have an element in
common. Therefore the distinct sets $K_a$ form a partition of M into
classes. ∎


³ Put somewhat differently, let $M^2$ be the set of all ordered pairs (a, b) with $a, b \in M$,
and let $\mathscr{R}$ be the subset of $M^2$ consisting of all labelled pairs. Then aRb if and only if
$(a, b) \in \mathscr{R}$, i.e., a binary relation is essentially just a subset of $M^2$. As an exercise, state
the three conditions for R to be an equivalence relation in terms of ordered pairs and the
set $\mathscr{R}$.


Remark. Because of Theorem 4, one often talks about the decomposition 
of M into equivalence classes. 
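The construction of the classes $K_a$ in the proof of Theorem 4 can be imitated for a finite set. The sketch below assumes that the supplied relation really is an equivalence relation (it does not verify this); the function name `equivalence_classes` and the example relation "a and b differ by a multiple of 3" are illustrative choices, not part of the text.

```python
def equivalence_classes(M, related):
    """Partition the finite collection M into classes of a relation.

    `related(a, b)` is ASSUMED to be reflexive, symmetric and
    transitive, so comparing x with one representative of a class
    is enough to decide whether x belongs to that class.
    """
    classes = []
    for x in M:
        for cls in classes:
            representative = next(iter(cls))
            if related(x, representative):
                cls.add(x)
                break
        else:
            classes.append({x})        # x starts a new class K_x
    return classes

def same_residue_mod_3(a, b):
    return (a - b) % 3 == 0

classes = equivalence_classes(range(10), same_residue_mod_3)
```

If the relation is not transitive (for instance "distance less than 1"), the result depends on the order in which elements are examined, which is one way of seeing why such a criterion cannot define a partition.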


There is an intimate connection between mappings and partitions into 
classes, as shown by the following examples:


Example 1. Let f be a mapping of a set A into a set B and partition A
into sets, each consisting of all elements with the same image $b = f(a) \in B$.
This gives a partition of A into classes. For example, suppose f projects
the xy-plane onto the x-axis, by mapping the point (x, y) into the point
(x, 0). Then the preimages of the points of the x-axis are vertical lines, and
the representation of the plane as the union of these lines is the decomposition
into classes corresponding to f.


Example 2. Given any partition of a set A into classes, let B be the set of
these classes and associate each element $a \in A$ with the class (i.e., element
of B) to which it belongs. This gives a mapping of A into B. For example,
suppose we partition three-dimensional space into classes by assigning to the
same class all points which are equidistant from the origin of coordinates.
Then every class is a sphere of a certain radius. The set of all these classes
can be identified with the set of points on the half-line $[0, \infty)$, each point
corresponding to a possible value of the radius. In this sense, the
decomposition of space into concentric spheres corresponds to the mapping of
space into the half-line $[0, \infty)$.


Example 3. Suppose we assign all real numbers with the same fractional
part⁴ to the same class. Then the mapping corresponding to this partition
has the effect of "winding" the real line onto a circle of unit circumference.
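The passage from a mapping to a partition (Example 1) can be sketched for a finite set by grouping elements according to their images. Below, the fractional part of Example 3 is used as the mapping; the sample points (half-integers between -2 and 2) and the helper names are assumptions made for the illustration, and exact rational arithmetic is used to avoid rounding issues.

```python
from collections import defaultdict
from fractions import Fraction
from math import floor

def partition_by_image(A, f):
    """Group the elements of A into classes with equal image under f."""
    classes = defaultdict(set)
    for a in A:
        classes[f(a)].add(a)
    return dict(classes)

def fractional_part(x):
    """x minus its integral part (the largest integer not exceeding x)."""
    return x - floor(x)

# Assumed sample of the real line: -2, -3/2, -1, ..., 3/2, 2.
points = [Fraction(n, 2) for n in range(-4, 5)]
classes = partition_by_image(points, fractional_part)
```

Each key of `classes` is a value of the mapping, and the corresponding set is the class of all sample points with that image.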

Problem 1. Prove that if $A \cup B = A$ and $A \cap B = A$, then A = B.

Problem 2. Show that in general $(A - B) \cup B \neq A$.

Problem 3. Let A = {2, 4, ..., 2n, ...} and B = {3, 6, ..., 3n, ...}.
Find $A \cap B$ and $A - B$.


⁴ The largest integer $\le x$ is called the integral part of x, denoted by [x], and the quantity
x - [x] is called the fractional part of x.
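Problem 4. Prove that
a) $(A - B) \cap C = (A \cap C) - (B \cap C)$;
b) $A \triangle B = (A \cup B) - (A \cap B)$.

Problem 5. Prove that

$$\bigcup_\alpha A_\alpha - \bigcup_\alpha B_\alpha \subset \bigcup_\alpha (A_\alpha - B_\alpha).$$


Problem 6. Let $A_n$ be the set of all positive integers divisible by n. Find
the sets

a) $\bigcup_{n=2}^{\infty} A_n$; b) $\bigcap_{n=2}^{\infty} A_n$.

Problem 7. Find

a) $\bigcup_{n=1}^{\infty} \left[a + \frac{1}{n},\, b - \frac{1}{n}\right]$; b) $\bigcap_{n=1}^{\infty} \left(a - \frac{1}{n},\, b + \frac{1}{n}\right)$.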
Problem 8. Let $A_\alpha$ be the set of points lying on the curve

$$y = \frac{1}{x^\alpha} \qquad (0 < x < \infty).$$


What is 


Problem 9. Let y = f(x) = (x) for all real x, where (x) is the fractional 
part of x. Prove that every closed interval of length 1 has the same image 
under f. What is this image? Is f one-to-one? What is the preimage of the 
interval $\frac{1}{4} \le y \le \frac{3}{4}$? Partition the real line into classes of points with the
same image. 


Problem 10. Given a set M, let $\mathscr{R}$ be the set of all ordered pairs of the
form (a, a) with $a \in M$, and let aRb if and only if $(a, b) \in \mathscr{R}$. Interpret the
relation R.


Problem 11. Give an example of a binary relation which is

a) Reflexive and symmetric, but not transitive;
b) Reflexive, but neither symmetric nor transitive;
c) Symmetric, but neither reflexive nor transitive;
d) Transitive, but neither reflexive nor symmetric.


2. Equivalence of Sets. The Power of a Set 


2.1. Finite and infinite sets. The set of all vertices of a given polyhedron, 
the set of all prime numbers less than a given number, and the set of all 
residents of New York City (at a given time) have a certain property in
common, namely, each set has a definite number of elements which can be
found in principle, if not in practice. Accordingly, these sets are all said to
be finite. Clearly, we can be sure that a set is finite without knowing the
number of elements in it. On the other hand, the set of all positive integers,
the set of all points on the line, the set of all circles in the plane, and the
set of all polynomials with rational coefficients have a different property
in common, namely, if we remove one element from each set, then remove
two elements, three elements, and so on, there will still be elements left in
the set at each stage. Accordingly, sets of this kind are said to be infinite.

Given two finite sets, we can always decide whether or not they have the 
same number of elements, and if not, we can always determine which set 
has more elements than the other. It is natural to ask whether the same is 
true of infinite sets. In other words, does it make sense to ask, for example, 
whether there are more circles in the plane than rational points on the line, 
or more functions defined in the interval [0, 1] than lines in space? As will 
soon be apparent, questions of this kind can indeed be answered. 

To compare two finite sets A and B, we can count the number of elements
in each set and then compare the two numbers, but alternatively, we can try
to establish a one-to-one correspondence between (the elements of) A and B,
i.e., a correspondence such that each element in A corresponds to one and
only one element in B and vice versa. It is clear that a one-to-one
correspondence between two finite sets can be set up if and only if the two sets
have the same number of elements. For example, to ascertain whether or 
not the number of students in an assembly is the same as the number of 
seats in the auditorium, there is no need to count the number of students 
and the number of seats. We need merely observe whether or not there are 
empty seats or students with no place to sit down. If the students can all 
be seated with no empty seats left, i.e., if there is a one-to-one correspondence 
between the set of students and the set of seats, then these two sets obviously 
have the same number of elements. The important point here is that the 
first method (counting elements) works only for finite sets, while the second 
method (setting up a one-to-one correspondence) works for infinite sets as
well as for finite sets. 


2.2. Countable sets. The simplest infinite set is the set $Z_+$ of all positive
integers. An infinite set is called countable if its elements can be put in
one-to-one correspondence with those of $Z_+$. In other words, a countable set is a
set whose elements can be numbered $a_1, a_2, \ldots, a_n, \ldots$. By an uncountable
set we mean, of course, an infinite set which is not countable.

We now give some examples of countable sets: 


Example 1. The set Z of all integers, positive, negative or zero, is
countable. In fact, we can set up the following one-to-one correspondence
between Z and the set $Z_+$ of all positive integers:

    0, -1, 1, -2, 2, ...
    1,  2, 3,  4, 5, ...

More explicitly, we associate the nonnegative integer $n \ge 0$ with the odd
number 2n + 1, and the negative integer n < 0 with the even number 2|n|,
i.e.,

$$n \leftrightarrow 2n + 1 \quad \text{if } n \ge 0,$$
$$n \leftrightarrow 2|n| \quad \text{if } n < 0$$

(the symbol $\leftrightarrow$ denotes a one-to-one correspondence).
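The correspondence of Example 1 can be written out as a pair of mutually inverse functions. The names `z_to_positive` and `positive_to_z` are hypothetical; the formulas are the ones stated in the example, together with their inverses.

```python
def z_to_positive(n):
    """Send the integer n to its number in the list 0, -1, 1, -2, 2, ..."""
    return 2 * n + 1 if n >= 0 else 2 * abs(n)

def positive_to_z(k):
    """Inverse map: odd k comes from n >= 0, even k from n < 0."""
    return (k - 1) // 2 if k % 2 == 1 else -(k // 2)

first_five = [positive_to_z(k) for k in range(1, 6)]
round_trip_ok = all(positive_to_z(z_to_positive(n)) == n
                    for n in range(-50, 51))
```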

Example 2. The set of all positive even numbers is countable, as shown
by the obvious correspondence $n \leftrightarrow 2n$.


Example 3. The set $2, 4, 8, \ldots, 2^n, \ldots$ of powers of 2 is countable, as
shown by the obvious correspondence $n \leftrightarrow 2^n$.


Example 4. The set Q of all rational numbers is countable. To see this,
we first note that every rational number $\alpha$ can be written as a fraction p/q,
q > 0 in lowest terms with a positive denominator. Call the sum |p| + q the
"height" of the rational number $\alpha$. For example,

$$\frac{0}{1} = 0$$

is the only rational number of height 1,

$$\frac{-1}{1}, \quad \frac{1}{1}$$

are the only rational numbers of height 2,

$$\frac{-2}{1}, \quad \frac{-1}{2}, \quad \frac{1}{2}, \quad \frac{2}{1}$$

are the only rational numbers of height 3, and so on. We can now arrange
all rational numbers in order of increasing height (with the numerators
increasing in each set of rational numbers of the same height). In other
words, we first count the rational numbers of height 1, then those of height
2 (suitably arranged), those of height 3, and so on. In this way, we assign
every rational number a unique positive integer, i.e., we set up a one-to-one
correspondence between the set Q of all rational numbers and the set $Z_+$
of all positive integers.
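The enumeration by height described in Example 4 can be sketched as a generator. The function name `rationals_by_height` is an assumption; the ordering (increasing height, then increasing numerator) follows the description in the example.

```python
from fractions import Fraction
from math import gcd

def rationals_by_height(max_height):
    """Yield each rational p/q (lowest terms, q > 0) with |p| + q <= max_height,
    ordered by height and, within a height, by increasing numerator."""
    for height in range(1, max_height + 1):
        for p in range(-(height - 1), height):
            q = height - abs(p)              # q >= 1 by the choice of p
            if gcd(abs(p), q) == 1:          # keep only fractions in lowest terms
                yield Fraction(p, q)

listed = list(rationals_by_height(3))
```

Because every rational number has exactly one representation in lowest terms with positive denominator, each one appears exactly once in the full (infinite) enumeration.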


Next we prove some elementary theorems involving countable sets:


THEOREM 1. Every subset of a countable set is countable.


Proof. Let A be countable, with elements $a_1, a_2, \ldots$, and let B be a
subset of A. Among the elements $a_1, a_2, \ldots$, let $a_{n_1}, a_{n_2}, \ldots$ be those in
the set B. If the set of numbers $n_1, n_2, \ldots$ has a largest number, then
B is finite. Otherwise B is countable (consider the correspondence
$i \leftrightarrow a_{n_i}$). ∎


THEOREM 2. The union of a finite or countable number of countable
sets $A_1, A_2, \ldots$ is itself countable.


Proof. We can assume that no two of the sets $A_1, A_2, \ldots$ have
elements in common, since otherwise we could consider the sets

$$A_1, \quad A_2 - A_1, \quad A_3 - (A_1 \cup A_2), \quad \ldots$$

instead, which are countable by Theorem 1 and have the same union as
the original sets. Suppose we write the elements of $A_1, A_2, \ldots$ in the
form of an infinite table

$$\begin{array}{ccccc}
a_{11} & a_{12} & a_{13} & a_{14} & \cdots \\
a_{21} & a_{22} & a_{23} & a_{24} & \cdots \\
a_{31} & a_{32} & a_{33} & a_{34} & \cdots \\
\cdots & \cdots & \cdots & \cdots & \cdots
\end{array} \tag{1}$$

where the elements of the set $A_1$ appear in the first row, the elements of
the set $A_2$ appear in the second row, and so on. We now count all the
elements in (1) "diagonally," i.e., first we choose $a_{11}$, then $a_{12}$, then $a_{21}$,
and so on, moving in the way shown in the following table:⁵

[Table (2): the array (1) with arrows tracing the successive diagonals, beginning $a_{11} \to a_{12} \to a_{21}$.] (2)

It is clear that this procedure associates a unique number to each element
in each of the sets $A_1, A_2, \ldots$, thereby establishing a one-to-one
correspondence between the union of the sets $A_1, A_2, \ldots$ and the set
$Z_+$ of all positive integers. ∎
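The diagonal counting used in the proof of Theorem 2 can be sketched as a generator of index pairs (i, j), meaning "the jth element of the ith set." As a simplifying assumption, the version below traverses every diagonal in the same direction, so it agrees with the order $a_{11}, a_{12}, a_{21}$ stated in the proof but is not claimed to reproduce the exact path drawn in table (2). The name `diagonal_indices` is hypothetical.

```python
from itertools import count, islice

def diagonal_indices():
    """Yield all pairs (i, j) of positive integers, one diagonal
    i + j = const at a time, so every pair is reached after
    finitely many steps."""
    for total in count(2):             # total = i + j
        for i in range(1, total):
            yield (i, total - i)

first_six = list(islice(diagonal_indices(), 6))

# Position (3, 2) in the table is reached at a definite finite step.
step_of_3_2 = next(n for n, pair in enumerate(diagonal_indices(), start=1)
                   if pair == (3, 2))
```

Counting row by row would never leave the first row; the diagonal order is what guarantees that each element receives a finite number.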


THEOREM 3. Every infinite set has a countable subset.


⁵ Discuss the obvious modifications of (1) and (2) in the case of only a finite number
of sets $A_1, A_2, \ldots$.

Proof. Let M be an infinite set and $a_1$ any element of M. Being
infinite, M contains an element $a_2$ distinct from $a_1$, an element $a_3$ distinct
from both $a_1$ and $a_2$, and so on. Continuing this process (which can
never terminate due to a "shortage" of elements, since M is infinite),
we get a countable subset

$$A = \{a_1, a_2, \ldots, a_n, \ldots\}$$

of the set M. ∎


Remark. Theorem 3 shows that countable sets are the "smallest" infinite
sets. The question of whether there exist uncountable (infinite) sets will be
considered below.


2.3. Equivalence of sets. We arrived at the notion of a countable set M
by considering one-to-one correspondences between M and the set $Z_+$ of all
positive integers. More generally, we can consider one-to-one correspondences
between any two sets M and N:


DEFINITION. Two sets M and N are said to be equivalent (written
$M \sim N$) if there is a one-to-one correspondence between the elements of
M and the elements of N.


The concept of equivalence⁶ is applicable to both finite and infinite sets.
Two finite sets are equivalent if and only if they have the same number of
elements. We can now define a countable set as a set equivalent to the set
$Z_+$ of all positive integers. It is clear that two sets which are equivalent to a
third set are equivalent to each other, and in particular that any two countable
sets are equivalent.


Example 1. The sets of points in any two closed intervals [a, b] and [c, d]
are equivalent, and Figure 5 shows how to set up a one-to-one correspondence
between them. Here two points p and q correspond to each other if and only
if they lie on the same ray emanating from the point O in which the
extensions of the line segments ac and bd intersect.

[Figure 5: rays from the point O pairing the points of [a, b] with the points of [c, d].]


⁶ Not to be confused with our previous use of the word in the phrase "equivalence
relation." However, note that set equivalence is an equivalence relation in the sense of
Sec. 1.4, being obviously reflexive, symmetric and transitive. Hence any family of sets
can be partitioned into classes of equivalent sets.


Example 2. The set of all points z in the complex plane is equivalent to
the set of all points $\alpha$ on a sphere. In fact, a one-to-one correspondence
$z \leftrightarrow \alpha$ between the points of the two sets can be established by using
stereographic projection, as shown in Figure 6 (O is the north pole of the
sphere).

[Figure 6: stereographic projection between the sphere and the complex plane.]


Example 3. The set of all points x in the open unit interval (0, 1) is
equivalent to the set of all points y on the whole real line. For example,
the formula

$$y = \frac{1}{\pi} \arctan x + \frac{1}{2}$$

establishes a one-to-one correspondence between these two sets.
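A correspondence between the real line and the interval (0, 1) of the kind used in Example 3 can be explored numerically. The sketch below assumes the arctangent formula $t \mapsto \frac{1}{\pi}\arctan t + \frac{1}{2}$, which carries each real number to a point of (0, 1), together with its inverse; the function names are hypothetical, and floating-point arithmetic makes the check approximate rather than exact.

```python
import math

def line_to_interval(t):
    """Map a real number t to a point of the open interval (0, 1)."""
    return math.atan(t) / math.pi + 0.5

def interval_to_line(u):
    """Inverse map from (0, 1) back to the real line."""
    return math.tan(math.pi * (u - 0.5))

samples = [-10.0, -1.0, 0.0, 1.0, 10.0]
values = [line_to_interval(t) for t in samples]
recovered = [interval_to_line(u) for u in values]
```

The map is strictly increasing, so distinct real numbers go to distinct points of (0, 1), and the inverse shows that every point of (0, 1) is attained.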


The last example and the examples in Sec. 2.2 show that an infinite set
is sometimes equivalent to one of its proper subsets. For example, there are
"as many" positive integers as integers of arbitrary sign, there are "as many"
points in the interval (0, 1) as on the whole real line, and so on. This fact
is characteristic of all infinite sets (and can be used to define such sets), as
shown by


THEOREM 4. Every infinite set is equivalent to one of its proper subsets. 


Proof. According to Theorem 3, any infinite set M contains a
countable subset. Let this subset be

$$A = \{a_1, a_2, \ldots, a_n, \ldots\},$$

and partition A into two countable subsets

$$A_1 = \{a_1, a_3, a_5, \ldots\}, \qquad A_2 = \{a_2, a_4, a_6, \ldots\}.$$

Obviously, we can establish a one-to-one correspondence between the
countable sets A and $A_1$ (merely let $a_n \leftrightarrow a_{2n-1}$). This correspondence
can be extended to a one-to-one correspondence between the sets
$A \cup (M - A) = M$ and $A_1 \cup (M - A) = M - A_2$ by simply assigning x
itself to each element $x \in M - A$. But $M - A_2$ is a proper subset of
M. ∎


2.4. Uncountability of the real numbers. Several examples of countable
sets were given in Sec. 2.2, and many more examples of such sets could be 
given. In fact, according to Theorem 2, the union of a finite or countable 
number of countable sets is itself countable. It is now natural to ask whether 
there exist infinite sets which are uncountable. The existence of such sets 
is shown by 
THEOREM 5. The set of real numbers in the closed unit interval [0, 1] is
uncountable.


Proof. Suppose we have somehow managed to count some or all of
the real numbers in [0, 1], arranging them in a list

$$\begin{aligned}
\alpha_1 &= 0.a_{11} a_{12} a_{13} \ldots a_{1n} \ldots, \\
\alpha_2 &= 0.a_{21} a_{22} a_{23} \ldots a_{2n} \ldots, \\
&\;\;\vdots \\
\alpha_n &= 0.a_{n1} a_{n2} a_{n3} \ldots a_{nn} \ldots, \\
&\;\;\vdots
\end{aligned} \tag{3}$$

where $a_{ik}$ is the kth digit in the decimal expansion of the number $\alpha_i$.
Consider the decimal

$$\beta = 0.b_1 b_2 \ldots b_n \ldots \tag{4}$$

constructed as follows: For $b_1$ choose any digit (from 0 to 9) different
from $a_{11}$, for $b_2$ any digit different from $a_{22}$, and so on, and in general
for $b_n$ any digit different from $a_{nn}$. Then the decimal (4) cannot coincide
with any decimal in the list (3). In fact, $\beta$ differs from $\alpha_1$ in at least the
first digit, from $\alpha_2$ in at least the second digit, and so on, since in general
$b_n \neq a_{nn}$ for all n. Thus no list of real numbers in the interval [0, 1]
can include all the real numbers in [0, 1].

The above argument must be refined slightly since certain numbers,
namely those of the form $p/10^q$, can be written as decimals in two ways,
either with an infinite run of zeros or an infinite run of nines. For
example,

$$\frac{1}{2} = \frac{5}{10} = 0.5000\ldots = 0.4999\ldots,$$

so that the fact that two decimals are distinct does not necessarily mean
that they represent distinct real numbers. However, this difficulty
disappears if in constructing $\beta$, we require that $\beta$ contain neither zeros
nor nines, for example by setting $b_n = 2$ if $a_{nn} = 1$ and $b_n = 1$ if
$a_{nn} \neq 1$. ∎
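The diagonal construction in the proof of Theorem 5 can be carried out on a finite piece of a list. The sketch below takes the first few digits of a few made-up decimals (the rows are assumptions, not data from the text) and applies the refined rule from the proof: $b_n = 2$ if $a_{nn} = 1$, and $b_n = 1$ otherwise. The function name `diagonal_digits` is hypothetical.

```python
def diagonal_digits(rows):
    """Given rows[n][k] = the (k+1)th digit of the (n+1)th listed decimal,
    return digits b_n that differ from the diagonal digit a_nn,
    using only the digits 1 and 2 (so no zeros and no nines occur)."""
    return [2 if row[n] == 1 else 1 for n, row in enumerate(rows)]

# Hypothetical first four digits of four listed decimals.
rows = [
    [1, 4, 1, 5],   # 0.1415...
    [3, 3, 3, 3],   # 0.3333...
    [0, 5, 1, 0],   # 0.0510...
    [9, 9, 9, 1],   # 0.9991...
]
beta = diagonal_digits(rows)

# beta differs from the nth row in the nth digit, for every n.
differs_everywhere = all(beta[n] != rows[n][n] for n in range(len(rows)))
```

The same rule applied to an infinite list produces a decimal that differs from the nth listed number in its nth digit, which is exactly what the proof needs.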


Thus the set [0, 1] is uncountable. Other examples of uncountable sets 
equivalent to [0, 1] are 


1) The set of points in any closed interval [a, b];
2) The set of points on the real line;
3) The set of points in any open interval (a, b);
4) The set of all points in the plane or in space;
5) The set of all points on a sphere or inside a sphere;
6) The set of all lines in the plane;
7) The set of all continuous real functions of one or several variables.

The fact that the sets 1) and 2) are equivalent to [0, 1] is proved as in Examples
1 and 3 of Sec. 2.3, while the fact that the sets 3)-7) are equivalent
to [0, 1] is best proved indirectly (cf. Problems 7 and 9).


2.5. The power of a set. Given any two sets M and N, suppose M and N 
are equivalent. Then M and N are said to have the same power. Roughly 
speaking, “‘power’’ is something shared by equivalent sets. If M and N are 
finite, then M and N have the same number of elements, and the concept 
of the power of a set reduces to the usual notion of the number of elements 
in a set. The power of the set $Z_+$ of all positive integers, and hence the power
of any countable set, is denoted by the symbol $\aleph_0$, read “aleph null.” A
set equivalent to the set of real numbers in the interval [0, 1], and hence to
the set of all real numbers, is said to have the power of the continuum,
denoted by c (or often by $\aleph$).

For the powers of finite sets, i.e., for the positive integers, we have the 
notions of “greater than’’ and “‘less than,”’ as well as the notion of equality. 
We now show how these concepts are extended to the case of infinite sets. 

Let A and B be any two sets, with powers m(A) and m(B), respectively. 
If A is equivalent to B, then m(A) = m(B) by definition. If A is equivalent 
to a subset of B and if no subset of A is equivalent to B, then, by analogy 
with the finite case, it is natural to regard m(A) as less than m(B) or m(B) as 
greater than m(A). Logically, however, there are two further possibilities: 


a) B has a subset equivalent to A, and A has a subset equivalent to B; 
b) A and B are not equivalent, and neither has a subset equivalent to the 
other. 


In case a), A and B are equivalent and hence have the same power, as shown 
by the Cantor-Bernstein theorem (Theorem 7 below). Case b) would obvi- 
ously show the existence of powers that cannot be compared, but it follows 
from the well-ordering theorem (see Sec. 3.7) that this case is actually impos- 
sible. Therefore, taking both of these theorems on faith, we see that any two 
sets A and B either have the same power or else satisfy one of the relations
m(A) < m(B) or m(A) > m(B). For example, it is clear that $\aleph_0 < c$
(why?).


Remark. The very deep problem of the existence of powers between $\aleph_0$
and c is touched upon in Sec. 3.9. As a rule, however, the infinite sets
encountered in analysis are either countable or else have the power of the 
continuum. 


We have already noted that countable sets are the “‘smallest’’ infinite 
sets. It has also been shown that there are infinite sets of power greater 
than that of a countable set, namely sets with the power of the continuum. 
It is natural to ask whether there are infinite sets of power greater than that 
of the continuum or, more generally, whether there is a “largest” power.
These questions are answered by 


THEOREM 6. Given any set M, let $\mathcal{M}$ be the set whose elements are all
possible subsets of M. Then the power of $\mathcal{M}$ is greater than the power of
the original set M.


Proof. Clearly, the power $\mu$ of the set $\mathcal{M}$ cannot be less than the power
m of the original set M, since the “single-element subsets” (or “singletons”)
of M form a subset of $\mathcal{M}$ equivalent to M. Thus we need only
show that m and $\mu$ do not coincide. Suppose a one-to-one correspondence


$a \leftrightarrow A, \quad b \leftrightarrow B, \quad \ldots$


has been established between the elements a, b, ... of M and certain
elements A, B, ... of $\mathcal{M}$ (i.e., certain subsets of M). Then A, B, ...
do not exhaust all the elements of $\mathcal{M}$, i.e., all the subsets of M. To see
this, let X be the set of elements of M which do not belong to their
“associated subsets.” More exactly, if $a \leftrightarrow A$ we assign a to X if $a \notin A$,
but not if $a \in A$. Clearly, X is a subset of M and hence an element of $\mathcal{M}$.
Suppose there is an element $x \in M$ such that $x \leftrightarrow X$, and consider
whether or not x belongs to X. Suppose $x \notin X$. Then $x \in X$, since, by
definition, X contains every element not contained in its associated
subset. On the other hand, suppose $x \in X$. Then $x \notin X$, since X consists
precisely of those elements which do not belong to their associated
subsets. In any event, the element x corresponding to the subset X must
simultaneously belong to X and not belong to X. But this is impossible!
It follows that there is no such element x. Therefore no one-to-one
correspondence can be established between the sets M and $\mathcal{M}$, i.e.,


$m \neq \mu$. ∎


Thus, given any set M, there is a set $\mathcal{M}$ of larger power, a set $\mathcal{M}^*$ of
still larger power, and so on indefinitely. In particular, there is no set of
“largest” power.
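For a finite set the argument of Theorem 6 can be checked exhaustively. The sketch below assumes a three-element set and uses the illustrative names `powerset` and `diagonal_set`; it runs through every way of associating a subset with each element and confirms that the set X of elements not belonging to their associated subsets is never one of the associated subsets.

```python
from itertools import combinations, product


def powerset(M):
    """All subsets of the finite set M, as frozensets."""
    items = sorted(M)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def diagonal_set(M, assoc):
    """X = set of elements a of M that do not belong to their associated subset."""
    return frozenset(a for a in M if a not in assoc[a])


M = {0, 1, 2}
subsets = powerset(M)

missed_every_time = True
for images in product(subsets, repeat=len(M)):
    assoc = dict(zip(sorted(M), images))   # one candidate correspondence a -> A
    X = diagonal_set(M, assoc)
    if X in assoc.values():                # would contradict the proof
        missed_every_time = False

print(len(subsets), missed_every_time)
```

The check covers all 512 possible assignments, one-to-one or not, which is more than the proof requires.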


2.6. The Cantor-Bernstein theorem. Next we prove an important theorem 
already used in the preceding section: 


THEOREM 7 (Cantor-Bernstein). Given any two sets A and B, suppose
A contains a subset $A_1$ equivalent to B, while B contains a subset $B_1$
equivalent to A. Then A and B are equivalent.


Proof. By hypothesis, there is a one-to-one function f mapping A
into $B_1$ and a one-to-one function g mapping B into $A_1$:


$f(A) = B_1 \subset B, \qquad g(B) = A_1 \subset A.$


Therefore


$A_2 = gf(A) = g(f(A)) = g(B_1)$


is a subset of $A_1$ equivalent to all of A. Similarly,


$B_2 = fg(B) = f(g(B)) = f(A_1)$


is a subset of $B_1$ equivalent to B. Let $A_3$ be the subset of A into which
the mapping gf carries the set $A_1$, and let $A_4$ be the subset of A into which
gf carries $A_2$. More generally, let $A_{k+2}$ be the set into which $A_k$ (k =
1, 2, ...) is carried by gf. Then clearly


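$A \supset A_1 \supset A_2 \supset \cdots \supset A_k \supset A_{k+1} \supset \cdots.$


Setting


$D = \bigcap_{k=1}^{\infty} A_k,$


we can represent A as the following union of pairwise disjoint sets:


$A = (A - A_1) \cup (A_1 - A_2) \cup (A_2 - A_3) \cup \cdots \cup (A_k - A_{k+1}) \cup \cdots \cup D.$ (5)


Similarly, we can write $A_1$ in the form


$A_1 = (A_1 - A_2) \cup (A_2 - A_3) \cup \cdots \cup (A_k - A_{k+1}) \cup \cdots \cup D.$ (6)


Clearly, (5) and (6) can be rewritten as


$A = D \cup M \cup N,$ (5')
$A_1 = D \cup M \cup N_1,$ (6')


where


$M = (A_1 - A_2) \cup (A_3 - A_4) \cup \cdots,$
$N = (A - A_1) \cup (A_2 - A_3) \cup \cdots,$
$N_1 = (A_2 - A_3) \cup (A_4 - A_5) \cup \cdots.$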


But $A - A_1$ is equivalent to $A_2 - A_3$ (the former is carried into the latter
by the one-to-one function gf), $A_2 - A_3$ is equivalent to $A_4 - A_5$, and
so on. Therefore N is equivalent to $N_1$. It follows from the representations
(5') and (6') that a one-to-one correspondence can be set up
between the sets A and $A_1$. But $A_1$ is equivalent to B, by hypothesis.
Therefore A is equivalent to B. ∎


Remark. Here we can even “afford the unnecessary luxury” of explicitly
writing down a one-to-one function $\varphi$ carrying A onto B, i.e.,


$\varphi(a) = g^{-1}(a)$ if $a \in D \cup M$,

$\varphi(a) = f(a)$ if $a \in N$


(see Figure 7).
FIGURE 7
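The piecewise map of the Cantor-Bernstein proof (f on the pieces $A - A_1$, $A_2 - A_3$, ..., and $g^{-1}$ elsewhere) can be sketched in code by tracing an element backwards through g and f until the trace stops. Everything concrete here is an assumption made for illustration: the name `cantor_bernstein`, the representation of inverses as functions returning `None` outside the image, the `max_depth` cutoff used to treat long traces as belonging to D, and the example A = B = {0, 1, 2, ...} with f(a) = a + 1 and g(b) = b + 1.

```python
from typing import Callable, Optional


def cantor_bernstein(
    f: Callable[[int], int],
    f_inv: Callable[[int], Optional[int]],
    g_inv: Callable[[int], Optional[int]],
    max_depth: int = 10_000,
) -> Callable[[int], int]:
    """Build phi: A -> B from injections f: A -> B and g: B -> A.

    f_inv(b) and g_inv(a) return the preimage, or None if there is none.
    """
    def phi(a: int) -> int:
        x = a
        for _ in range(max_depth):
            b = g_inv(x)
            if b is None:          # trace stops in A - A_1: a lies in N
                return f(a)
            prev = f_inv(b)
            if prev is None:       # trace stops in B - B_1: a lies in M
                return g_inv(a)
            x = prev               # step back two levels and continue
        return g_inv(a)            # no stop found: treat a as lying in D

    return phi


# Hypothetical example: both injections are the shift n -> n + 1.
shift_inv = lambda n: n - 1 if n >= 1 else None
phi = cantor_bernstein(lambda a: a + 1, shift_inv, shift_inv)
values = [phi(a) for a in range(10)]
print(values)
```

In this example D is empty, N consists of the even numbers and M of the odd numbers, so the resulting map simply swaps 0 with 1, 2 with 3, and so on.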


Problem 1. Prove that a set with an uncountable subset is itself un- 
countable. 


Problem 2. Let M be any infinite set and A any countable set. Prove that
$M \sim M \cup A$.


Problem 3. Prove that each of the following sets is countable: 


a) The set of all numbers with two distinct decimal expansions (like 
0.5000... and 0.4999 . . .); 

b) The set of all rational points in the plane (i.e., points with rational
coordinates);

c) The set of all rational intervals (i.e., intervals with rational end points) ; 

d) The set of all polynomials with rational coefficients. 


Problem 4. A number $\alpha$ is called algebraic if it is a root of a polynomial
equation with rational coefficients. Prove that the set of all algebraic numbers
is countable.


Problem 5. Prove the existence of uncountably many transcendental num- 
bers, i.e., numbers which are not algebraic. 


Hint. Use Theorems 2 and 5.


Problem 6. Prove that the set of all real functions (more generally, 
functions taking values in a set containing at least two elements) defined 
on a set M is of power greater than the power of M. In particular, prove 
that the power of the set of all real functions (continuous and discontinuous)
defined in the interval [0, 1] is greater than c. 


Hint. Use the fact that the set of all characteristic functions (i.e., functions 
taking only the values 0 and 1) on M is equivalent to the set of all subsets 
of M. 


Problem 7. Give an indirect proof of the equivalence of the closed interval
[a, b], the open interval (a, b) and the half-open interval [a, b) or (a, b].

Hint. Use Theorem 7.


Problem 8. Prove that the union of a finite or countable number of sets
each of power c is itself of power c.


Problem 9. Prove that each of the following sets has the power of the 
continuum: 


a) The set of all infinite sequences of positive integers; 
b) The set of all ordered n-tuples of real numbers; 
c) The set of all infinite sequences of real numbers. 


Problem 10. Develop a contradiction inherent in the notion of the “set
of all sets which are not members of themselves.”


Hint. Is this set a member of itself?


Comment. Thus we will be careful to avoid sets which are “too big,” like
the “set of all sets.”


3. Ordered Sets and Ordinal Numbers 


3.1. Partially ordered sets. A binary relation R on a set M is said to be a 
partial ordering (and the set M itself is said to be partially ordered) if 


1) R is reflexive (aRa for every $a \in M$);
2) R is transitive (aRb and bRc together imply aRc);
3) R is antisymmetric in the sense that aRb and bRa together imply a = b.


For example, if M is the set of all real numbers and aRb means $a \le b$, then
R is a partial ordering. This suggests writing $a \le b$ (or equivalently $b \ge a$)
instead of aRb whenever R is a partial ordering, and we will do so from now
on. Similarly, we write a < b if $a \le b$, $a \neq b$ and b > a if $b \ge a$, $b \neq a$.

The following examples give some idea of the generality of the concept 
of a partial ordering: 


Example 1. Any set M can be partially ordered in a trivial way by setting
$a \le b$ if and only if a = b.


Example 2. Let M be the set of all continuous functions f, g, ... defined
in a closed interval $[\alpha, \beta]$. Then we get a partial ordering by setting $f \le g$
if and only if $f(t) \le g(t)$ for every $t \in [\alpha, \beta]$.


Example 3. The set of all subsets $M_1, M_2, \ldots$ of a given set is partially
ordered if $M_1 \le M_2$ means that $M_1 \subset M_2$.


Example 4. The set of all integers greater than 1 is partially ordered if
$a \le b$ means that “b is divisible by a.”
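On a finite sample, the three defining properties of a partial ordering can be tested by brute force. The sketch below is illustrative only: the helper `is_partial_order` and the sample ranges are assumptions, and a finite check is of course no substitute for a proof. It confirms the properties for divisibility (Example 4) and inclusion (Example 3) and shows that the strict relation a < b fails, since it is not reflexive.

```python
from itertools import product


def is_partial_order(elements, leq):
    """Check reflexivity, transitivity and antisymmetry of leq on a finite set."""
    elements = list(elements)
    reflexive = all(leq(a, a) for a in elements)
    transitive = all(leq(a, c)
                     for a, b, c in product(elements, repeat=3)
                     if leq(a, b) and leq(b, c))
    antisymmetric = all(a == b
                        for a, b in product(elements, repeat=2)
                        if leq(a, b) and leq(b, a))
    return reflexive and transitive and antisymmetric


integers = range(2, 30)
subsets = [frozenset(s) for s in ([], [1], [2], [1, 2], [1, 2, 3])]

divisibility_ok = is_partial_order(integers, lambda a, b: b % a == 0)
inclusion_ok = is_partial_order(subsets, lambda s, t: s <= t)
strict_ok = is_partial_order(integers, lambda a, b: a < b)

print(divisibility_ok, inclusion_ok, strict_ok)
```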
An element a of a partially ordered set is said to be maximal if $a \le b$
implies b = a and minimal if $b \le a$ implies b = a. Thus in Example 4 every
prime number (greater than 1) is a minimal element.


3.2. Order-preserving mappings. Isomorphisms. Let M and M' be any
two partially ordered sets, and let f be a one-to-one mapping of M onto M'.
Then f is said to be order-preserving if $a \le b$ (where $a, b \in M$) implies
$f(a) \le f(b)$ (in M'). An order-preserving mapping f such that $f(a) \le f(b)$ implies
$a \le b$ is called an isomorphism. In other words, an isomorphism between
two partially ordered sets M and M' is a one-to-one mapping of M onto M'
such that $f(a) \le f(b)$ if and only if $a \le b$. Two partially ordered sets M
and M' are said to be isomorphic (to each other) if there exists an isomorphism
between them.


Example. Let M be the set of positive integers greater than 1 partially
ordered as in Example 4, Sec. 3.1, and let M' be the same set partially ordered
in the natural way, i.e., in such a way that $a \le b$ if and only if b - a is
nonnegative. Then the mapping of M onto M' carrying every integer n
into itself is order-preserving, but not an isomorphism.
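The example can be checked directly on an initial segment of the integers. In this sketch the range 2, ..., 19 and the variable names are assumptions chosen for illustration; the identity map preserves order from the divisibility ordering to the natural ordering, but the converse implication fails, e.g. for the pair 2, 3.

```python
from itertools import product

M = range(2, 20)
divides = lambda a, b: b % a == 0      # ordering of M (Example 4, Sec. 3.1)
natural = lambda a, b: a <= b          # ordering of M'
f = lambda n: n                        # the identity mapping

# a <= b in M implies f(a) <= f(b) in M'.
order_preserving = all(natural(f(a), f(b))
                       for a, b in product(M, repeat=2) if divides(a, b))

# f(a) <= f(b) in M' implies a <= b in M (this is what an isomorphism needs).
reflects_order = all(divides(a, b)
                     for a, b in product(M, repeat=2) if natural(f(a), f(b)))

print(order_preserving, reflects_order)
```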


Isomorphism between partially ordered sets is an equivalence relation 
as defined in Sec. 1.4, being obviously reflexive, symmetric and transitive. 
Hence any given family of partially ordered sets can be partitioned into 
disjoint classes of isomorphic sets.⁷ Clearly, two isomorphic partially
ordered sets can be regarded as identical in cases where it is the structure 
of the partial ordering rather than the specific nature of the elements of the 
sets that is of interest. 


3.3. Ordered sets. Order types. Given two elements a and b of a partially
ordered set M, it may turn out that neither of the relations $a \le b$ or $b \le a$
holds. In this case, a and b are said to be noncomparable. Thus, in general,
the relation $\le$ is defined only for certain pairs of elements, which is why M
is said to be partially ordered. However, suppose M has no noncomparable
elements. Then M is said to be ordered (synonymously, simply or linearly
ordered). In other words, a set M is ordered if it is partially ordered and if,
given any two distinct elements $a, b \in M$, either a < b or b < a. Obviously,
any subset of an ordered set is itself ordered.

Each of the sets figuring in Examples 1-4, Sec. 3.1 is partially ordered,
but not ordered. Simple examples of ordered sets are the set of all positive
integers, the set of all rational numbers, the set of all real numbers in the
interval [0, 1], and so on (with the usual relations of “greater than” and “less
than”).


7 Note that we avoid talking about the “family of all partially ordered sets” (recall
Problem 10, p. 20).

Since an ordered set is a special kind of partially ordered set, the concepts 
of order-preserving mapping and isomorphism apply equally well to ordered 
sets. Two isomorphic ordered sets are said to have the same (order) type. 
Thus “type” is something shared by all isomorphic ordered sets, just as
“power” is something shared by all equivalent sets (considered as “plain”
sets, without regard for possible orderings).

The simplest example of an ordered set is the set of all positive integers 
1,2,3,... arranged in increasing order, with the usual meaning of the 
symbol <. The order type of this set is denoted by the symbol w. Two iso- 
morphic ordered sets obviously have the same power (an isomorphism is a 
one-to-one correspondence). Thus it makes sense to talk about the power 
corresponding to a given order type. For example, the power $\aleph_0$ corresponds
to the order type w. The converse is not true, since a set of a given power can 
in general be ordered in many different ways. It is only in the finite case that 
the number of elements in a set uniquely determines its type, designated by 
the same symbol n as the number of elements in the set. For example, 
besides the “‘natural’’ order type w of the set of positive integers, there is 
another order type corresponding to the sequence 


$1, 3, 5, \ldots, 2, 4, 6, \ldots,$


where odd and even numbers are separately arranged in increasing order, 
but any odd number precedes any even number. It can be shown that the 
number of distinct order types of a set of power $\aleph_0$ is infinite and in fact
uncountable. 
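The second ordering of the positive integers mentioned above (all odd numbers first, then all even numbers) can be illustrated on a finite range with a sort key. This is only a sketch under that assumption: the name `odds_first_key` is illustrative, and a finite list cannot exhibit the feature that distinguishes the two order types, namely that in the full ordering the number 2 is preceded by infinitely many elements.

```python
def odds_first_key(n):
    """Sort key: odd numbers (False) come before even numbers (True)."""
    return (n % 2 == 0, n)


arranged = sorted(range(1, 11), key=odds_first_key)
print(arranged)
```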


3.4. Ordered sums and products of ordered sets. Let $M_1$ and $M_2$ be two
ordered sets of types $\theta_1$ and $\theta_2$, respectively. Then we can introduce an
ordering in the union $M_1 \cup M_2$ of the two sets by assuming that


1) a and b have the same ordering as in $M_1$ if $a, b \in M_1$;
2) a and b have the same ordering as in $M_2$ if $a, b \in M_2$;
3) a < b if $a \in M_1$, $b \in M_2$


(verify that this is actually an ordering of $M_1 \cup M_2$). The set $M_1 \cup M_2$
ordered in this way is called the ordered sum of $M_1$ and $M_2$, denoted by
$M_1 + M_2$. Note that the order of terms matters here, i.e., in general $M_2 + M_1$
is not isomorphic to $M_1 + M_2$. More generally, we can define the ordered
sum of any finite number of ordered sets by writing (cf. Problem 6)


$M_1 + M_2 + M_3 = (M_1 + M_2) + M_3,$
$M_1 + M_2 + M_3 + M_4 = (M_1 + M_2 + M_3) + M_4,$


and so on. By the ordered sum of the types $\theta_1$ and $\theta_2$, denoted by $\theta_1 + \theta_2$,
we mean the order type of the set $M_1 + M_2$.


Example. Consider the order types $\omega$ and n. It is easy to see that
$n + \omega = \omega$. In fact, if finitely many terms are written to the left of the
sequence 1, 2, ..., k, ..., we again get a set of the same type (why?).
On the other hand, the order type $\omega + n$, i.e., the order type of the set⁸


$\{1, 2, \ldots, k, \ldots, a_1, a_2, \ldots, a_n\},$


is obviously not equal to $\omega$.


Again let $M_1$ and $M_2$ be two ordered sets of types $\theta_1$ and $\theta_2$, respectively.
Suppose we replace each element of $M_2$ by a “replica” of the set $M_1$. Then
the resulting set, denoted by $M_1 \cdot M_2$, is called the ordered product of $M_1$
and $M_2$. More exactly, $M_1 \cdot M_2$ is the set of all pairs (a, b) where $a \in M_1$,
$b \in M_2$, ordered in such a way that


1) $(a_1, b_1) < (a_2, b_2)$ if $b_1 < b_2$ (for arbitrary $a_1$, $a_2$);
2) $(a_1, b) < (a_2, b)$ if $a_1 < a_2$.


Note that the order of factors matters here, i.e., in general $M_2 \cdot M_1$ is not
isomorphic to $M_1 \cdot M_2$. The ordered product of any finite number of ordered
sets can be defined by writing (cf. Problem 6)


$M_1 \cdot M_2 \cdot M_3 = (M_1 \cdot M_2) \cdot M_3,$
$M_1 \cdot M_2 \cdot M_3 \cdot M_4 = (M_1 \cdot M_2 \cdot M_3) \cdot M_4,$


and so on. By the ordered product of the types $\theta_1$ and $\theta_2$, denoted by $\theta_1 \cdot \theta_2$,
we mean the order type of the set $M_1 \cdot M_2$.
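For finite ordered sets, the two constructions of this section can be written out as lists. The sketch below makes several assumptions for illustration: the sets are given as increasing lists, elements of the sum are tagged with 1 or 2 to keep the two summands apart, and the function names are hypothetical. The product is sorted by the second coordinate first, which is exactly rules 1) and 2) for ordered products.

```python
def ordered_sum(m1, m2):
    """Elements of M1 + M2 in increasing order: all of M1, then all of M2."""
    return [(1, a) for a in m1] + [(2, b) for b in m2]


def ordered_product(m1, m2):
    """Pairs (a, b) of M1 . M2, compared by b first and then by a."""
    pairs = [(a, b) for a in m1 for b in m2]
    return sorted(pairs, key=lambda p: (p[1], p[0]))


m1 = [1, 2]
m2 = [10, 20, 30]

total = ordered_sum(m1, m2)
prod = ordered_product(m1, m2)
print(total)
print(prod)
```

Each element of `m2` is replaced by a copy of `m1`, as in the description of the ordered product. For finite sets the order of terms or factors does not change the order type; the non-commutativity noted in the text appears only for infinite sets.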
3.5. Well-ordered sets. Ordinal numbers. A key concept in the theory of 
ordered sets is given by 


DEFINITION 1. An ordered set M is said to be well-ordered if every
nonempty subset A of M has a smallest (or “first”) element, i.e., an element
$\mu$ such that $\mu \le a$ for every $a \in A$.


Example 1. Every finite ordered set is obviously well-ordered. 


Example 2. Every nonempty subset of a well-ordered set is itself well- 
ordered. 


Example 3. The set M of rational numbers in the interval [0, 1] is ordered
but not well-ordered. It is true that M has a smallest element, namely the
number 0, but the subset of M consisting of all positive rational numbers
has no smallest element.


8 Here we use the same curly bracket notation as in Sec. 1.1, but the order of terms
is now crucial.


DEFINITION 2. The order type of a well-ordered set is called an ordinal
number or simply an ordinal.⁹ If the set is infinite, the ordinal is said to be
transfinite.


Example 4. The set of positive integers 1, 2, ..., k, ... arranged in
increasing order is well-ordered, and hence its order type is a (transfinite)
ordinal. The order type $\omega + n$ of the set


$\{1, 2, \ldots, k, \ldots, a_1, a_2, \ldots, a_n\}$


is also an ordinal.


Example 5. The set 
{...,—k,..., —3, —2, —1} (1) 


is ordered but not well-ordered. It is true that any nonempty subset A of
(1) has a largest element (i.e., an element $\nu$ such that $a \le \nu$ for every $a \in A$),
but in general A will not have a smallest element. In fact, the set (1) itself
has no smallest element. Hence the order type of (1), denoted by $\omega^*$, is not
an ordinal number.


THEOREM 1. The ordered sum of a finite number of well-ordered sets
$M_1, M_2, \ldots, M_n$ is itself a well-ordered set.


Proof. Let M be an arbitrary nonempty subset of the ordered sum
$M_1 + M_2 + \cdots + M_n$, and let $M_k$ be the first of the sets $M_1, M_2, \ldots, M_n$
(namely the set with smallest index) containing elements of M. Then $M \cap M_k$
is a nonempty subset of the well-ordered set $M_k$, and as such has a smallest
element $\mu$. Clearly $\mu$ is the smallest element of M itself. ∎


COROLLARY. The ordered sum of a finite number of ordinal numbers is 
itself an ordinal number. 


Thus new ordinal numbers can be constructed from any given set of 
ordinal numbers. For example, starting from the positive integers (i.e., the 
finite ordinal numbers) and the ordinal number w, we can construct the new 
ordinal numbers 


$\omega + n, \quad \omega + \omega, \quad \omega + \omega + n, \quad \omega + \omega + \omega,$
and so on. 


THEOREM 2. The ordered product of two well-ordered sets $M_1$ and $M_2$
is itself a well-ordered set.


9 This is a good place to point out that the terms “cardinal number” and “power”
(of a set) are synonymous.
Proof. Let M be an arbitrary nonempty subset of $M_1 \cdot M_2$, so that M is a set of
ordered pairs (a, b) with $a \in M_1$, $b \in M_2$. The set of all second elements b
of pairs in M is a subset of $M_2$, and as such has a smallest element since
$M_2$ is well-ordered. Let $b_1$ denote this smallest element, and consider
all pairs of the form $(a, b_1)$ contained in M. The set of all first elements
a of these pairs is a subset of $M_1$, and as such has a smallest element
since $M_1$ is well-ordered. Let $a_1$ denote this smallest element. Then the
pair $(a_1, b_1)$ is clearly the smallest element of M. ∎
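The two-step search used in this proof (first the smallest second element, then the smallest first element among the pairs that have it) translates directly into code for a finite set of integer pairs. The function name `smallest_in_product` and the sample pairs are assumptions made for illustration.

```python
def smallest_in_product(subset):
    """Smallest pair of a nonempty finite subset of M1 . M2, found as in the proof."""
    b1 = min(b for _, b in subset)                  # smallest second element
    a1 = min(a for a, b in subset if b == b1)       # smallest first element with b = b1
    return (a1, b1)


pairs = {(5, 2), (1, 3), (4, 2), (0, 7)}
smallest = smallest_in_product(pairs)
print(smallest)
```

The result agrees with taking the minimum of the pairs under the product ordering, i.e., comparing second coordinates first.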


COROLLARY 1. The ordered product of a finite number of well-ordered 
sets is itself a well-ordered set. 


COROLLARY 2. The ordered product of a finite number of ordinal num- 
bers is itself an ordinal number. 


Thus it makes sense to talk about the ordinal numbers 
$\omega \cdot n, \quad \omega^2, \quad \omega^2 \cdot n, \quad \omega^3,$
and so on. It is also possible to define such ordinal numbers as¹⁰


$\omega^\omega, \quad \omega^{\omega^\omega}, \quad \ldots$


3.6. Comparison of ordinal numbers. If $n_1$ and $n_2$ are two finite ordinal
numbers, then they either coincide or else one is larger than the other. As
we now show, the same is true of transfinite ordinal numbers. We begin by
observing that every element a of a well-ordered set M determines an (initial)
section P, the set of all $x \in M$ such that x < a, and a remainder Q, the set
of all $x \in M$ such that $x \ge a$. Given any two ordinal numbers $\alpha$ and $\beta$, let
M and N be well-ordered sets of order type $\alpha$ and $\beta$, respectively. Then we
say that

1) $\alpha = \beta$ if M and N are isomorphic;

2) $\alpha < \beta$ if M is isomorphic to some section of N;

3) $\alpha > \beta$ if N is isomorphic to some section of M


(note that this definition makes sense for finite $\alpha$ and $\beta$).


LEMMA. Let f be an isomorphism of a well-ordered set A onto some
subset $B \subset A$. Then $f(a) \ge a$ for all $a \in A$.


Proof. If there are elements $a \in A$ such that f(a) < a, then there is a
least such element since A is well-ordered. Let $a_0$ be this element, and
let $b_0 = f(a_0)$. Then $b_0 < a_0$, and hence $f(b_0) < f(a_0) = b_0$ since f is an
isomorphism. But then $a_0$ is not the smallest element such that f(a) < a.
Contradiction! ∎


10 See, e.g., A. A. Fraenkel, Abstract Set Theory, third edition, North-Holland
Publishing Co., Amsterdam (1966), pp. 202-208.
It follows from the lemma that a well-ordered set A cannot be isomorphic
to any of its sections, since if A were isomorphic to the section
determined by a, then clearly f(a) < a. In other words, the two relations


$\alpha = \beta, \qquad \alpha < \beta$


are incompatible, and so are


$\alpha = \beta, \qquad \alpha > \beta.$


Moreover, the two relations


$\alpha < \beta, \qquad \alpha > \beta$


are incompatible, since otherwise we could use the transitivity to deduce
$\alpha < \alpha$, which is impossible by the lemma. Therefore, if one of the three
relations


$\alpha < \beta, \qquad \alpha = \beta, \qquad \alpha > \beta$ (2)


holds, the other two are automatically excluded. We must still show that 
one of the relations (2) always holds, thereby proving that any two ordinal 
numbers are comparable. 


THEOREM 3. Two given ordinal numbers $\alpha$ and $\beta$ satisfy one and only
one of the relations


$\alpha < \beta, \qquad \alpha = \beta, \qquad \alpha > \beta.$


Proof. Let $W(\alpha)$ be the set of all ordinals $< \alpha$. Any two numbers
$\gamma$ and $\gamma'$ in $W(\alpha)$ are comparable¹¹ and the corresponding ordering of
$W(\alpha)$ makes it a well-ordered set of type $\alpha$. In fact, if a set


$A = \{\ldots, a, \ldots, b, \ldots\}$


is of type $\alpha$, then by definition, the ordinals less than $\alpha$ are the types of
well-ordered sets isomorphic to sections of A. Hence the ordinals themselves
are in one-to-one correspondence with the elements of A. In other
words, the elements of a set of type $\alpha$ can be numbered by using the
ordinals less than $\alpha$:


$A = \{a_1, a_2, \ldots, a_\mu, \ldots\}.$


Now let $\alpha$ and $\beta$ be any two ordinals. Then $A = W(\alpha)$ and $B = W(\beta)$ are
well-ordered sets of types $\alpha$ and $\beta$, respectively. Moreover, let $C = A \cap B$
be the intersection of the sets A and B, i.e., the set of all ordinals less than
both $\alpha$ and $\beta$. Then C is well-ordered, of type $\gamma$, say. We now show that
$\gamma \le \alpha$. If C = A, then obviously $\gamma = \alpha$. On the other hand, if $C \neq A$,
then C is a section of A and hence $\gamma < \alpha$. In fact, let $\xi \in C$, $\eta \in A - C$.
Then $\xi$ and $\eta$ are comparable, i.e., either $\xi < \eta$ or $\xi > \eta$. But $\eta < \xi < \alpha$
is impossible, since then $\eta \in C$. Therefore $\xi < \eta$ and hence C is a
section of A, which implies $\gamma < \alpha$. Moreover, $\gamma$ is the first element of
the set A - C. Thus $\gamma \le \alpha$, as asserted, and similarly $\gamma \le \beta$. The case
$\gamma < \alpha$, $\gamma < \beta$ is impossible, since then $\gamma \in A - C$, $\gamma \in B - C$. But
then $\gamma \notin C$ on the one hand and $\gamma \in A \cap B = C$ on the other hand.
It follows that there are only three possibilities


$\gamma = \alpha, \quad \gamma = \beta, \quad \alpha = \beta,$
$\gamma = \alpha, \quad \gamma < \beta, \quad \alpha < \beta,$
$\gamma < \alpha, \quad \gamma = \beta, \quad \alpha > \beta,$


i.e., $\alpha$ and $\beta$ are comparable. ∎


11 Recall the meaning of $\gamma < \alpha$, $\gamma' < \alpha$, and use the fact that a section of a section of
a well-ordered set is itself a section of a well-ordered set.


THEOREM 4. Let A and B be well-ordered sets. Then either A is equivalent
to B or one of the sets is of greater power than the other, i.e., the powers
of A and B are comparable.


Proof. There is a definite power corresponding to each ordinal. But
we have just seen that ordinals are comparable, and so are the corresponding
powers (recall the definition of inequality of powers given in
Sec. 2.5). ∎


3.7. The well-ordering theorem, the axiom of choice and equivalent asser- 
tions. Theorem 4 shows that the powers of two well-ordered sets are always 
comparable. In 1904, Zermelo succeeded in proving the 


WELL-ORDERING THEOREM. Every set can be well-ordered. 


It follows from the well-ordering theorem and Theorem 4 that the powers of
two arbitrary sets are always comparable, a fact already used in Sec. 2.5.
Zermelo's proof, which will not be given here,¹² rests on the following basic


AXIOM OF CHOICE. Given any set M, there is a “choice function” f such
that f(A) is an element of A for every nonempty subset $A \subset M$.


We will assume the validity of the axiom of choice without further ado. 
In fact, without the axiom of choice we would be severely hampered in 
making set-theoretic constructions. However, it should be noted that from 
the standpoint of the foundations of set theory, there are still deep and 
controversial problems associated with the use of the axiom of choice. 

There are a number of assertions equivalent to the axiom of choice, i.e.,
assertions each of which both implies and is implied by the axiom of choice.
One of these is the well-ordering theorem, which obviously implies the axiom
of choice. In fact, if an arbitrary set M can be well-ordered, then, by merely
choosing the “first” element in each nonempty subset $A \subset M$, we get the
function f(A) figuring in the statement of the axiom of choice. On the other
hand, the axiom of choice implies the well-ordering theorem, as already noted
without proof.


12 A. A. Fraenkel, op. cit., pp. 222-227.

To state further assertions equivalent to the axiom of choice, we need
some more terminology:


DEFINITION 3. Let M be a partially ordered set, and let A be any subset
of M such that a and b are comparable for every $a, b \in A$. Then A is called
a chain (in M). A chain C is said to be maximal if there is no other chain C'
in M containing C as a proper subset.


DEFINITION 4. An element a of a partially ordered set M is called an
upper bound of a subset $M' \subset M$ if $a' \le a$ for every $a' \in M'$.


We now have the vocabulary needed to state two other assertions equiv- 
alent to the axiom of choice: 


HAUSDORFF’S MAXIMAL PRINCIPLE. Every chain in a partially ordered 
set M is contained in a maximal chain in M. 


ZORN’S LEMMA. If every chain in a partially ordered set M has an upper 
bound, then M contains a maximal element. 


For the proof of the equivalence of the axiom of choice, the well-ordering 
theorem, Hausdorff’s maximal principle and Zorn’s lemma, we refer the 
reader elsewhere.¹³ Of these various equivalent assertions, Zorn's lemma is
perhaps the most useful. 
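In a finite partially ordered set no appeal to the axiom of choice is needed, and the content of Hausdorff's maximal principle can be seen by simply adding elements to a chain one at a time. The sketch below is an illustration under that finiteness assumption: the function name `extend_to_maximal_chain`, the divisibility ordering on 2, ..., 24 and the starting chain are all hypothetical choices.

```python
def extend_to_maximal_chain(elements, leq, chain):
    """Greedily extend a chain of a finite partially ordered set to a maximal one."""
    chain = list(chain)
    for x in elements:
        # x may be added only if it is comparable with every element of the chain.
        if x not in chain and all(leq(x, c) or leq(c, x) for c in chain):
            chain.append(x)
    return chain


divides = lambda a, b: b % a == 0
elements = list(range(2, 25))

maximal_chain = extend_to_maximal_chain(elements, divides, [6])
print(sorted(maximal_chain))
```

One pass suffices: an element rejected at some stage is noncomparable with an element already in the chain, and that element stays in the chain. The maximal chain found depends on the order in which the elements are visited.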


3.8. Transfinite induction. Mathematical propositions are very often 
proved by using the following familiar 


THEOREM 4 (Mathematical induction). Given a proposition P(n) formulated
for every positive integer n, suppose that


1) P(1) is true;

2) The validity of P(k) for all $k \le n$ implies the validity of P(n + 1).

Then P(n) is true for all n = 1, 2, ...

Proof. Suppose P(n) fails to be true for all n = 1, 2, ..., and let
$n_1$ be the smallest integer for which P(n) is false (the existence of $n_1$
follows from the well-ordering of the positive integers). Clearly $n_1 > 1$,
so that $n_1 - 1$ is a positive integer. Therefore P(k) is valid for all
$k \le n_1 - 1$ but not for $n_1$. Contradiction! ∎


Replacing the set of all positive integers by an arbitrary well-ordered set,
we get


THEOREM 4' (Transfinite induction). Given a well-ordered set A,¹⁴ let
P(a) be a proposition formulated for every element $a \in A$. Suppose that


1) P(a) is true for the smallest element of A;
2) The validity of P(a) for all a < a* implies the validity of P(a*).


Then P(a) is true for all $a \in A$.


Proof. Suppose P(a) fails to be true for all $a \in A$. Then P(a) is false
for all a in some nonempty subset $A^* \subset A$. By the well-ordering, A*
has a smallest element a*. Therefore P(a) is valid for all a < a* but
not for a*. Contradiction! ∎


13 See, e.g., G. Birkhoff, Lattice Theory, third edition, American Mathematical Society,
Providence, R.I. (1967), pp. 205-206.


Remark. Since any set can be well-ordered, by the well-ordering theorem, 
transfinite induction can in principle be applied to any set M whatsoever. 
In practice, however, Zorn’s lemma is a more useful tool, requiring only that 
M be partially ordered. 


3.9. Historical remarks. Set theory as a branch of mathematics in its 
own right stems from the pioneer work of Georg Cantor (1845-1918). 
Originally met with disbelief, Cantor’s ideas subsequently became widespread. 
By now, the set-theoretic point of view has become standard in the most 
diverse fields of mathematics. Basic concepts, like groups, rings, fields, linear 
spaces, etc. are habitually defined as sets of elements of an arbitrary kind 
obeying appropriate axioms. 

Further development of set theory led to a number of logical difficulties, 
which naturally gave rise to attempts to replace “‘naive’’ set theory by a more 
rigorous, axiomatic set theory. It turns out that certain set-theoretic questions, 
which would at first seem to have “‘yes’’ or “no’’ answers, are in fact of a 
different kind. Thus it was shown by Gödel in 1940 that a negative answer
to the question “Is there an uncountable set of power less than that of the
continuum?” is consistent with set theory (axiomatized in a way we will not
discuss here), but it was recently shown by Cohen that an affirmative answer 
to the question is also consistent in the same sense! 


Problem 1. Exhibit both a partial ordering and a simple ordering of the
set of all complex numbers.


Problem 2. What is the minimal element of the set of all subsets of a
given set X, partially ordered by set inclusion? What is the maximal element?


Problem 3. A partially ordered set M is said to be a directed set if, given
any two elements $a, b \in M$, there is an element $c \in M$ such that $a \le c$, $b \le c$.
Are the partially ordered sets in Examples 1-4, Sec. 3.1 all directed sets?


14 For example, the set of all transfinite ordinals less than a given ordinal.


Problem 4. By the greatest lower bound of two elements a and b of a
partially ordered set M, we mean an element $c \in M$ such that $c \le a$, $c \le b$
and there is no element $d \in M$ such that $c < d \le a$, $d \le b$. Similarly, by
the least upper bound of a and b, we mean an element $c \in M$ such that $a \le c$,
$b \le c$ and there is no element $d \in M$ such that $a \le d < c$, $b \le d$. By a
lattice is meant a partially ordered set any two elements of which have both
a greatest lower bound and a least upper bound. Prove that the set of all
subsets of a given set X, partially ordered by set inclusion, is a lattice. What
is the set-theoretic meaning of the greatest lower bound and least upper
bound of two elements of this set?


Problem 5. Prove that an order-preserving mapping of one ordered set 
onto another is automatically an isomorphism. 


Problem 6. Prove that ordered sums and products of ordered sets are
associative, i.e., prove that if $M_1$, $M_2$ and $M_3$ are ordered sets, then


$(M_1 + M_2) + M_3 = M_1 + (M_2 + M_3), \qquad (M_1 \cdot M_2) \cdot M_3 = M_1 \cdot (M_2 \cdot M_3),$


where the operations + and $\cdot$ are the same as in Sec. 3.4.


Comment. This allows us to drop parentheses in writing ordered sums 
and products. 


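Problem 7. Construct well-ordered sets with ordinals 

$$\omega + n, \quad \omega + \omega, \quad \omega + \omega + n, \quad \omega + \omega + \omega, \quad \ldots$$

Show that the sets are all countable. 


Problem 8. Construct well-ordered sets with ordinals 

$$\omega \cdot n, \quad \omega^2, \quad \omega^2 \cdot n, \quad \omega^3, \quad \ldots$$

Show that the sets are all countable. 


Problem 9. Show that 

$$\omega + \omega = \omega \cdot 2, \quad \omega + \omega + \omega = \omega \cdot 3, \quad \ldots$$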


Problem 10. Prove that the set $W(\alpha)$ of all ordinals less than a given 
ordinal $\alpha$ is well-ordered. 


Problem 11. Prove that any nonempty set of ordinals is well-ordered. 


Problem 12. Prove that the set M of all ordinals corresponding to a 
countable set is itself uncountable. 


Problem 13. Let $\aleph_1$ be the power of the set M in the preceding problem. 
Prove that there is no power m such that $\aleph_0 < m < \aleph_1$. 




4. Systems of Sets¹⁵ 


4.1. Rings of sets. By a system of sets we mean any set whose elements 
are themselves sets. Unless the contrary is explicitly stated, the elements 
of a given system of sets will be assumed to be certain subsets of some fixed 
set X. Systems of sets will usually be denoted by capital script letters like 
$\mathcal{R}$, $\mathcal{S}$, etc. Our chief interest will be systems of sets which have certain 
closure properties under the operations introduced in Sec. 1.1. 


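DEFINITION 1. A nonempty system of sets $\mathcal{R}$ is called a ring (of sets) if 
$A \,\Delta\, B \in \mathcal{R}$ and $A \cap B \in \mathcal{R}$ whenever $A \in \mathcal{R}$, $B \in \mathcal{R}$. 

Since 

$$A \cup B = (A \,\Delta\, B) \,\Delta\, (A \cap B),$$
$$A - B = A \,\Delta\, (A \cap B),$$

we also have $A \cup B \in \mathcal{R}$ and $A - B \in \mathcal{R}$ whenever $A \in \mathcal{R}$, $B \in \mathcal{R}$. 
Thus a ring of sets is a system of sets closed under the operations of 
taking unions, intersections, differences, and symmetric differences. 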


Clearly, a ring of sets is also closed under the operations of taking finite 
unions and intersections: 

$$\bigcup_{k=1}^{n} A_k, \qquad \bigcap_{k=1}^{n} A_k.$$

A ring of sets must contain the empty set $\emptyset$, since $A - A = \emptyset$. 
A set E is called the unit of a system of sets $\mathcal{S}$ if $E \in \mathcal{S}$ and 

$$A \cap E = A$$

for every $A \in \mathcal{S}$. Clearly E is unique (why?). Thus the unit of $\mathcal{S}$ is 
just the maximal set of $\mathcal{S}$, i.e., the set containing all other sets of $\mathcal{S}$. 
A ring of sets with a unit is called an algebra (of sets). 
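
For a finite system of finite sets, the conditions of Definition 1 and the 
definition of a unit can be checked mechanically. The following Python sketch 
is only an illustration; the helper names `is_ring`, `unit` and `powerset` are 
our own and do not come from the text. 

```python
from itertools import combinations

def is_ring(system):
    """True if the nonempty finite system is closed under symmetric
    difference and intersection (the two conditions of Definition 1)."""
    system = set(system)
    if not system:
        return False
    return all((a ^ b) in system and (a & b) in system
               for a in system for b in system)

def unit(system):
    """Return a set E of the system with A & E == A for every A, or None."""
    for e in system:
        if all(a & e == a for a in system):
            return e
    return None

def powerset(s):
    """All subsets of s, as frozensets."""
    s = list(s)
    return {frozenset(c) for r in range(len(s) + 1) for c in combinations(s, r)}

all_subsets = powerset({1, 2, 3})   # an algebra with unit {1, 2, 3}
# Not a ring: the symmetric difference of {1} and {2} is {1, 2}, which is missing.
not_a_ring = {frozenset(), frozenset({1}), frozenset({2})}
```

Here `is_ring(all_subsets)` holds and `unit(all_subsets)` is the whole set 
{1, 2, 3}, while `is_ring(not_a_ring)` fails. 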


Example 1. Given a set A, the system $\mathfrak{M}(A)$ of all subsets of A is an 
algebra of sets, with unit E = A. 


Example 2. The system $\{\emptyset, A\}$ consisting of the empty set $\emptyset$ and any 
nonempty set A is an algebra of sets, with E = A. 


Example 3. The system of all finite subsets of a given set A is a ring of 
sets. This ring is an algebra if and only if A itself is finite. 


Example 4. The system of all bounded subsets of the real line is a ring of 
sets, which does not contain a unit. 


¹⁵ The material in this section need not be read now, since it will not be needed until 
Chapter 7. 


THEOREM 1. The intersection 

$$\mathcal{R} = \bigcap_{\alpha} \mathcal{R}_\alpha$$

of any set of rings is itself a ring. 

Proof. An immediate consequence of Definition 1. ∎ 


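THEOREM 2. Given any nonempty system of sets $\mathcal{S}$, there is a unique 
ring $\mathcal{P}$ containing $\mathcal{S}$ and contained in every ring containing $\mathcal{S}$. 


Proof. If $\mathcal{P}$ exists, then clearly $\mathcal{P}$ is unique (why?). To prove the 
existence of $\mathcal{P}$, consider the union 

$$X = \bigcup_{A \in \mathcal{S}} A$$

of all sets A belonging to $\mathcal{S}$ and the ring $\mathfrak{M}(X)$ of all subsets of X. Let 
$\Sigma$ be the set of all rings of sets contained in $\mathfrak{M}(X)$ and containing $\mathcal{S}$. 
Then the intersection 

$$\mathcal{P} = \bigcap_{\mathcal{R} \in \Sigma} \mathcal{R}$$

of all these rings clearly has the desired properties. In fact, $\mathcal{P}$ obviously 
contains $\mathcal{S}$. Moreover, if $\mathcal{R}^*$ is any ring containing $\mathcal{S}$, then the 
intersection $\mathcal{R} = \mathcal{R}^* \cap \mathfrak{M}(X)$ is a ring in $\Sigma$ and hence $\mathcal{P} \subset \mathcal{R} \subset \mathcal{R}^*$, 
as required. The ring $\mathcal{P}$ is called the minimal ring generated by the system 
$\mathcal{S}$, and will henceforth be denoted by $\mathcal{R}(\mathcal{S})$. ∎ 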
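
When the system consists of finitely many finite sets, the minimal ring can 
also be obtained by repeatedly adding symmetric differences and intersections 
until nothing new appears. This closure procedure is not the construction used 
in the proof above (which intersects all rings in $\Sigma$), but for a finite system 
it gives the same ring, since every ring containing the system must contain 
each set produced. The function name `generated_ring` is our own. 

```python
def generated_ring(system):
    """Smallest ring containing a finite system of finite sets,
    computed by closing under symmetric difference and intersection."""
    ring = {frozenset(a) for a in system}
    changed = True
    while changed:
        changed = False
        for a in list(ring):
            for b in list(ring):
                for c in (a ^ b, a & b):
                    if c not in ring:
                        ring.add(c)
                        changed = True
    return ring

# The system {{1, 2}, {2, 3}} generates all 8 subsets of {1, 2, 3}.
ring = generated_ring([{1, 2}, {2, 3}])
print(len(ring))   # 8
```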


Remark. The set $\mathfrak{M}(X)$ containing $\mathcal{R}(\mathcal{S})$ has been introduced to avoid 
talking about the “set of all rings containing $\mathcal{S}$.” Such concepts as “the 
set of all sets,” “the set of all rings,” etc. are inherently contradictory and 
should be avoided (recall Problem 10, p. 20). 


4.2. Semirings of sets. The following notion is more general than that 
of a ring of sets and plays an important role in a number of problems (par- 
ticularly in measure theory): 


DEFINITION 2. A system of sets $\mathcal{S}$ is called a semiring (of sets) if 

1) $\mathcal{S}$ contains the empty set $\emptyset$; 

2) $A \cap B \in \mathcal{S}$ whenever $A \in \mathcal{S}$, $B \in \mathcal{S}$; 

3) If $\mathcal{S}$ contains the sets A and $A_1 \subset A$, then A can be represented 
as a finite union 

$$A = \bigcup_{k=1}^{n} A_k \tag{1}$$

of pairwise disjoint sets of $\mathcal{S}$, with the given set $A_1$ as its first term. 


Remark. The representation (1) is called a finite expansion of A, with 
respect to the sets $A_1, A_2, \ldots, A_n$. 


Example 1. Every ring of sets $\mathcal{R}$ is a semiring, since if $\mathcal{R}$ contains A and 
$A_1 \subset A$, then $A = A_1 \cup A_2$ where $A_2 = A - A_1 \in \mathcal{R}$. 


Example 2. The set $\mathcal{S}$ of all open intervals (a, b), closed intervals [a, b] 
and half-open intervals [a, b), (a, b], including the “empty interval” $(a, a) = \emptyset$ 
and the single-element sets [a, a] = {a}, is a semiring but not a ring. 
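
A small discrete analogue of Example 2 can be checked by machine. The sketch 
below is an illustration under our own assumptions: instead of real intervals 
it uses the half-open integer intervals [a, b) with $0 \le a \le b \le 4$, each stored 
as the set of integers it contains, and the names `interval`, `expansion` and 
`is_expansion` are hypothetical helpers. 

```python
def interval(a, b):
    """Integer points of the half-open interval [a, b), as a frozenset."""
    return frozenset(range(a, b))

N = 4
S = {interval(a, b) for a in range(N + 1) for b in range(a, N + 1)}

def expansion(A, A1):
    """Split A into pairwise disjoint members of S with A1 as first term.
    Assumes A and A1 belong to S and A1 is a subset of A."""
    if not A1:
        return [A1, A]
    left = interval(min(A), min(A1))            # part of A to the left of A1
    right = interval(max(A1) + 1, max(A) + 1)   # part of A to the right of A1
    return [A1] + [piece for piece in (left, right) if piece]

def is_expansion(A, A1):
    pieces = expansion(A, A1)
    return (pieces[0] == A1
            and all(p in S for p in pieces)
            and frozenset().union(*pieces) == A
            and sum(len(p) for p in pieces) == len(A))   # forces disjointness

has_empty = frozenset() in S
closed_under_intersection = all((A & B) in S for A in S for B in S)
expansions_ok = all(is_expansion(A, A1) for A in S for A1 in S if A1 <= A)
closed_under_union = all((A | B) in S for A in S for B in S)
```

The first three flags come out true, so this system is a semiring, while 
`closed_under_union` is false (for instance, the union of [0, 1) and [2, 3) 
is not an interval), so it is not a ring. 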


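LEMMA 1. Suppose the sets $A, A_1, \ldots, A_n$, where $A_1, \ldots, A_n$ are 
pairwise disjoint subsets of A, all belong to a semiring $\mathcal{S}$. Then there is a 
finite expansion 

$$A = \bigcup_{k=1}^{s} A_k \qquad (s \ge n)$$

with $A_1, \ldots, A_n$ as its first n terms, where $A_k \in \mathcal{S}$, $A_k \cap A_l = \emptyset$ for all 
$k \ne l$ ($k, l = 1, \ldots, s$). 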


Proof. The lemma holds for n = 1, by the definition of a semiring. 
Suppose the lemma holds for n = m, and consider m + 1 sets $A_1, \ldots, 
A_m, A_{m+1}$ satisfying the conditions of the lemma. By hypothesis, 

$$A = A_1 \cup \cdots \cup A_m \cup B_1 \cup \cdots \cup B_p,$$

where the sets $A_1, \ldots, A_m, B_1, \ldots, B_p$ are pairwise disjoint subsets of 
A, all belonging to $\mathcal{S}$. Let 

$$B_{q1} = A_{m+1} \cap B_q.$$

By the definition of a semiring, 

$$B_q = B_{q1} \cup \cdots \cup B_{q r_q},$$

where the sets $B_{qj}$ ($j = 1, \ldots, r_q$) are pairwise disjoint subsets of $B_q$, 
all belonging to $\mathcal{S}$. But then it is easy to see that 

$$A = A_1 \cup \cdots \cup A_m \cup A_{m+1} \cup \bigcup_{q=1}^{p} \Bigl( \bigcup_{j=2}^{r_q} B_{qj} \Bigr),$$

i.e., the lemma is true for n = m + 1. The proof now follows by mathematical 
induction. ∎ 


LEMMA 2. Given any finite system of sets $A_1, \ldots, A_n$ belonging to a 
semiring $\mathcal{S}$, there is a finite system of pairwise disjoint sets $B_1, \ldots, B_t$ 
belonging to $\mathcal{S}$ such that every $A_k$ has a finite expansion 

$$A_k = \bigcup_{s \in M_k} B_s \qquad (k = 1, \ldots, n)$$

with respect to certain of the sets $B_s$.¹⁶ 


¹⁶ Here $M_k$ denotes some subset of the set {1, 2, ..., t}, depending on the choice of k. 


Proof. The lemma is trivial for n = 1, since we need only set t = 1, 
$B_1 = A_1$. Suppose the lemma is true for n = m, and consider a system 
of sets $A_1, \ldots, A_m, A_{m+1}$ in $\mathcal{S}$. Let $B_1, \ldots, B_t$ be sets of $\mathcal{S}$ satisfying 
the conditions of the lemma with respect to $A_1, \ldots, A_m$, and let 

$$B_{s1} = A_{m+1} \cap B_s.$$

Then, by Lemma 1, there is an expansion 

$$A_{m+1} = \Bigl( \bigcup_{s=1}^{t} B_{s1} \Bigr) \cup \Bigl( \bigcup_{p=1}^{q} B'_p \Bigr) \qquad (B'_p \in \mathcal{S}),$$

while, by the very definition of a semiring, there is an expansion 

$$B_s = B_{s1} \cup B_{s2} \cup \cdots \cup B_{s r_s} \qquad (B_{sj} \in \mathcal{S}).$$

It is easy to see that 

$$A_k = \bigcup_{s \in M_k} \Bigl( \bigcup_{j=1}^{r_s} B_{sj} \Bigr) \qquad (k = 1, \ldots, m)$$

for some suitable $M_k$. Moreover, the sets $B_{sj}$, $B'_p$ are pairwise disjoint. 
Hence the sets $B_{sj}$, $B'_p$ satisfy the conditions of the lemma with respect 
to $A_1, \ldots, A_m, A_{m+1}$. The proof now follows by mathematical induction. ∎ 
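
For the semiring of half-open intervals [a, b), the disjoint sets $B_s$ of 
Lemma 2 can be written down directly by cutting at every endpoint. The sketch 
below illustrates the statement of the lemma only; the endpoint shortcut is 
specific to intervals and is not the inductive construction used in the proof. 
The name `refine` and the encoding of [a, b) as a pair `(a, b)` are our own 
assumptions. 

```python
def refine(intervals):
    """Given pairs (a, b) standing for half-open intervals [a, b), return
    (pieces, index_sets): pairwise disjoint pieces B_s and, for each A_k,
    the set M_k of indices s such that A_k is the union of those B_s."""
    cuts = sorted({p for a, b in intervals if a < b for p in (a, b)})
    candidates = list(zip(cuts, cuts[1:]))
    # Keep only the pieces lying inside at least one of the given intervals.
    pieces = [(c, d) for c, d in candidates
              if any(a <= c and d <= b for a, b in intervals)]
    index_sets = [{s for s, (c, d) in enumerate(pieces) if a <= c and d <= b}
                  for a, b in intervals]
    return pieces, index_sets

pieces, index_sets = refine([(0, 3), (2, 5), (7, 8)])
print(pieces)       # [(0, 2), (2, 3), (3, 5), (7, 8)]
print(index_sets)   # [{0, 1}, {1, 2}, {3}]
```

Thus [0, 3) is the union of the pieces with indices 0 and 1, [2, 5) is the 
union of the pieces with indices 1 and 2, and [7, 8) is itself a piece. 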


4.3. The ring generated by a semiring. According to Theorem 2, there is 
a unique minimal ring $\mathcal{R}(\mathcal{S})$ generated by a given system of sets $\mathcal{S}$. The 
actual construction of $\mathcal{R}(\mathcal{S})$ is quite complicated for arbitrary $\mathcal{S}$. However, 
the construction is completely straightforward if $\mathcal{S}$ is a semiring, as shown 
by 
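THEOREM 3. If $\mathcal{S}$ is a semiring, then $\mathcal{R}(\mathcal{S})$ coincides with the system 
$\mathcal{Z}$ of all sets A which have finite expansions 

$$A = \bigcup_{k=1}^{n} A_k$$

with respect to the sets $A_k \in \mathcal{S}$. 

Proof. First we prove that $\mathcal{Z}$ is a ring. Let A and B be any two sets in 
$\mathcal{Z}$. Then there are expansions 

$$A = \bigcup_{i=1}^{m} A_i \quad (A_i \in \mathcal{S}), \qquad B = \bigcup_{j=1}^{n} B_j \quad (B_j \in \mathcal{S}).$$

Since $\mathcal{S}$ is a semiring, the sets 

$$C_{ij} = A_i \cap B_j$$

also belong to $\mathcal{S}$. By Lemma 1, there are expansions 

$$A_i = \Bigl( \bigcup_{j=1}^{n} C_{ij} \Bigr) \cup \Bigl( \bigcup_{k=1}^{r_i} D_{ik} \Bigr) \quad (D_{ik} \in \mathcal{S}), \qquad B_j = \Bigl( \bigcup_{i=1}^{m} C_{ij} \Bigr) \cup \Bigl( \bigcup_{l=1}^{s_j} E_{jl} \Bigr) \quad (E_{jl} \in \mathcal{S}). \tag{2}$$

It follows from (2) that $A \cap B$ and $A \,\Delta\, B$ have the expansions 

$$A \cap B = \bigcup_{i,j} C_{ij},$$
$$A \,\Delta\, B = \Bigl( \bigcup_{i,k} D_{ik} \Bigr) \cup \Bigl( \bigcup_{j,l} E_{jl} \Bigr),$$

and hence belong to $\mathcal{Z}$. Therefore $\mathcal{Z}$ is a ring. The fact that $\mathcal{Z}$ is the 
minimal ring generated by $\mathcal{S}$ is obvious. ∎ 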
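
Theorem 3 can be tested on a small example. The sketch below is an illustration 
under our own assumptions: the semiring consists of the integer intervals 
[a, b) with $0 \le a \le b \le 3$, stored as sets of integers, and the helper names 
`disjoint_unions` and `closure` are hypothetical. It compares the system of 
all finite unions of pairwise disjoint members of the semiring with the ring 
obtained by closing the semiring under symmetric difference and intersection. 

```python
from itertools import combinations

def interval(a, b):
    """Integer points of the half-open interval [a, b), as a frozenset."""
    return frozenset(range(a, b))

S = {interval(a, b) for a in range(4) for b in range(a, 4)}

def disjoint_unions(S):
    """All sets that are unions of pairwise disjoint members of S."""
    members = [A for A in S if A]
    result = {frozenset()}
    for r in range(1, len(members) + 1):
        for combo in combinations(members, r):
            if all(not (A & B) for A, B in combinations(combo, 2)):
                result.add(frozenset().union(*combo))
    return result

def closure(S):
    """Close S under symmetric difference and intersection."""
    ring = set(S)
    while True:
        new = {c for a in ring for b in ring for c in (a ^ b, a & b)} - ring
        if not new:
            return ring
        ring |= new

Z = disjoint_unions(S)
print(Z == closure(S), len(Z))   # True 8
```

In this example both systems consist of all 8 subsets of {0, 1, 2}. 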


4.4. Borel algebras. There are many problems (particularly in measure 
theory) involving unions and intersections not only of a finite number of 
sets, but also of a countable number of sets. This motivates the following 
concepts: 


DEFINITION 3. A ring of sets is called a $\sigma$-ring if it contains the union 

$$S = \bigcup_{n=1}^{\infty} A_n$$

whenever it contains the sets $A_1, A_2, \ldots, A_n, \ldots$. A $\sigma$-ring with a unit 
E is called a $\sigma$-algebra. 


DEFINITION 4. A ring of sets is called a $\delta$-ring if it contains the 
intersection 

$$D = \bigcap_{n=1}^{\infty} A_n$$

whenever it contains the sets $A_1, A_2, \ldots, A_n, \ldots$. A $\delta$-ring with a unit 
E is called a $\delta$-algebra. 


THEOREM 4. Every $\sigma$-algebra is a $\delta$-algebra and conversely. 

Proof. An immediate consequence of the “dual” formulas 

$$\bigcup_n A_n = E - \bigcap_n (E - A_n), \qquad \bigcap_n A_n = E - \bigcup_n (E - A_n). \; ∎$$


The term Borel algebra (or briefly, B-algebra) is often used to denote 
a $\sigma$-algebra (equivalently, a $\delta$-algebra). The simplest example of a B-algebra 
is the set of all subsets of a given set A. 

Given any system of sets $\mathcal{S}$, there always exists at least one B-algebra 
containing $\mathcal{S}$. In fact, let 

$$X = \bigcup_{A \in \mathcal{S}} A.$$

Then the system $\mathcal{B}$ of all subsets of X is clearly a B-algebra containing $\mathcal{S}$. 

If $\mathcal{B}$ is any B-algebra containing $\mathcal{S}$ and if E is its unit, then every 
$A \in \mathcal{S}$ is contained in E and hence 

$$X = \bigcup_{A \in \mathcal{S}} A \subset E.$$

A B-algebra $\mathcal{B}$ is called irreducible (with respect to the system $\mathcal{S}$) if X = E, 
i.e., an irreducible B-algebra is a B-algebra containing no points that do 
not belong to one of the sets $A \in \mathcal{S}$. In every case, it will be enough to 
consider only irreducible B-algebras. 


Theorem 2 has the following analogue for irreducible B-algebras: 


THEOREM 5. Given any nonempty system of sets $\mathcal{S}$, there is a unique 
irreducible¹⁷ B-algebra $\mathcal{B}(\mathcal{S})$ containing $\mathcal{S}$ and contained in every 
B-algebra containing $\mathcal{S}$. 


Proof. The proof is virtually identical with that of Theorem 2. The 
B-algebra $\mathcal{B}(\mathcal{S})$ is called the minimal B-algebra generated by the system 
$\mathcal{S}$ or the Borel closure of $\mathcal{S}$. ∎ 


Remark. An important role is played in analysis by Borel sets or B-sets. 
These are the subsets of the real line belonging to the minimal B-algebra 
generated by the set of all closed intervals [a, b]. 


Problem 1. Let X be an uncountable set, and let $\mathcal{R}$ be the ring consisting 
of all finite subsets of X and their complements. Is $\mathcal{R}$ a $\sigma$-ring? 


Problem 2. Are open intervals Borel sets? 


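Problem 3. Let y = f(x) be a function defined on a set M and taking 
values in a set N. Let $\mathfrak{M}$ be a system of subsets of M, and let $f(\mathfrak{M})$ denote 
the system of all images f(A) of sets $A \in \mathfrak{M}$. Moreover, let $\mathfrak{N}$ be a system 
of subsets of N, and let $f^{-1}(\mathfrak{N})$ denote the system of all preimages $f^{-1}(B)$ 
of sets $B \in \mathfrak{N}$. Prove that 

a) If $\mathfrak{N}$ is a ring, so is $f^{-1}(\mathfrak{N})$; 

b) If $\mathfrak{N}$ is an algebra, so is $f^{-1}(\mathfrak{N})$; 

c) If $\mathfrak{N}$ is a B-algebra, so is $f^{-1}(\mathfrak{N})$; 

d) $\mathcal{R}(f^{-1}(\mathfrak{N})) = f^{-1}(\mathcal{R}(\mathfrak{N}))$; 

e) $\mathcal{B}(f^{-1}(\mathfrak{N})) = f^{-1}(\mathcal{B}(\mathfrak{N}))$. 

Which of these assertions remain true if $\mathfrak{N}$ is replaced by $\mathfrak{M}$ and $f^{-1}$ by f? 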


¹⁷ More exactly, irreducible with respect to $\mathcal{S}$. 


METRIC SPACES 


5. Basic Concepts 


5.1. Definitions and examples. One of the most important operations in 
mathematical analysis is the taking of limits. Here what matters is not so 
much the algebraic nature of the real numbers,¹ but rather the fact that 
distance from one point to another on the real line (or in two or 
three-dimensional space) is well-defined and has certain properties. Roughly 
speaking, a metric space is a set equipped with a distance (or “metric”) 
which has these same properties. More exactly, we have 


DEFINITION 1. By a metric space is meant a pair $(X, \rho)$ consisting of 
a set X and a distance $\rho$, i.e., a single-valued, nonnegative, real function 
$\rho(x, y)$ defined for all $x, y \in X$ which has the following three properties: 


1) $\rho(x, y) = 0$ if and only if x = y; 
2) Symmetry: $\rho(x, y) = \rho(y, x)$; 
3) Triangle inequality: $\rho(x, z) \le \rho(x, y) + \rho(y, z)$. 


We will often refer to the set X as a “space” and its elements x, y, ... as 
“points.” Metric spaces are usually denoted by a single letter, like 

$$R = (X, \rho),$$

or even by the same letter X as used for the underlying space, in cases where 
there is no possibility of confusion. 
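
On a finite set of points, the three properties of Definition 1 can be tested 
exhaustively. The following sketch is only an illustration; the name 
`is_metric_on`, the sample points and the two candidate functions are our own 
choices. Passing the test on a finite sample does not, of course, prove that a 
function is a metric on a larger set. 

```python
from itertools import product

def is_metric_on(points, rho, tol=1e-12):
    """Check the three properties of a distance on a finite list of points."""
    for x, y in product(points, repeat=2):
        d = rho(x, y)
        if d < 0 or (d == 0) != (x == y):      # property 1 and nonnegativity
            return False
        if abs(d - rho(y, x)) > tol:           # property 2 (symmetry)
            return False
    # property 3 (triangle inequality), with a small tolerance for rounding
    return all(rho(x, z) <= rho(x, y) + rho(y, z) + tol
               for x, y, z in product(points, repeat=3))

points = [-2.0, 0.0, 0.5, 3.0]
usual = lambda x, y: abs(x - y)        # the usual distance on the real line
squared = lambda x, y: (x - y) ** 2    # violates the triangle inequality

print(is_metric_on(points, usual))     # True
print(is_metric_on(points, squared))   # False
```

The second function fails because, for the points -2, 0 and 3, the value 
25 exceeds 4 + 9. 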


¹ I.e., the fact that the real numbers form a field. 


Example 1. Setting 

$$\rho(x, y) = \begin{cases} 0 & \text{if } x = y, \\ 1 & \text{if } x \ne y, \end{cases}$$

where x and y are elements of an arbitrary set X, we obviously get a metric 
space, which might be called a “discrete space” or a “space of isolated 
points.” 


Example 2. The set of all real numbers with distance 

$$\rho(x, y) = |x - y|$$

is a metric space, which we denote by $R^1$. 

Example 3. The set of all ordered n-tuples 

$$x = (x_1, x_2, \ldots, x_n)$$

of real numbers $x_1, x_2, \ldots, x_n$, with distance 

$$\rho(x, y) = \sqrt{\sum_{k=1}^{n} (x_k - y_k)^2}, \tag{1}$$

is a metric space denoted by $R^n$ and called n-dimensional Euclidean space 
(or simply Euclidean n-space). The distance (1) obviously has properties 
1) and 2) in Definition 1. Moreover, it is easy to see that (1) satisfies the 
triangle inequality. In fact, let 

$$x = (x_1, x_2, \ldots, x_n), \quad y = (y_1, y_2, \ldots, y_n), \quad z = (z_1, z_2, \ldots, z_n)$$

be three points in $R^n$, and let 

$$a_k = x_k - y_k, \quad b_k = y_k - z_k \qquad (k = 1, \ldots, n).$$

Then the triangle inequality takes the form 

$$\sqrt{\sum_{k=1}^{n} (x_k - z_k)^2} \le \sqrt{\sum_{k=1}^{n} (x_k - y_k)^2} + \sqrt{\sum_{k=1}^{n} (y_k - z_k)^2}, \tag{2}$$

or equivalently 

$$\sqrt{\sum_{k=1}^{n} (a_k + b_k)^2} \le \sqrt{\sum_{k=1}^{n} a_k^2} + \sqrt{\sum_{k=1}^{n} b_k^2}. \tag{2'}$$

It follows from the Cauchy-Schwarz inequality 

$$\Bigl( \sum_{k=1}^{n} a_k b_k \Bigr)^2 \le \sum_{k=1}^{n} a_k^2 \sum_{k=1}^{n} b_k^2 \tag{3}$$

(see Problem 2) that 

$$\sum_{k=1}^{n} (a_k + b_k)^2 = \sum_{k=1}^{n} a_k^2 + 2 \sum_{k=1}^{n} a_k b_k + \sum_{k=1}^{n} b_k^2 \le \sum_{k=1}^{n} a_k^2 + 2 \sqrt{\sum_{k=1}^{n} a_k^2 \sum_{k=1}^{n} b_k^2} + \sum_{k=1}^{n} b_k^2 = \Bigl( \sqrt{\sum_{k=1}^{n} a_k^2} + \sqrt{\sum_{k=1}^{n} b_k^2} \Bigr)^2.$$

Taking square roots, we get (2') and hence (2). 


Example 4. Take the same set of ordered n-tuples $x = (x_1, \ldots, x_n)$ as in 
the preceding example, but this time define the distance by the function 

$$\rho_1(x, y) = \sum_{k=1}^{n} |x_k - y_k|. \tag{4}$$

It is clear that (4) has all three properties of a distance figuring in Definition 
1. The corresponding metric space will be denoted by $R_1^n$. 


Example 5. Take the same set as in Examples 3 and 4, but this time 
define distance between two points $x = (x_1, \ldots, x_n)$ and $y = (y_1, \ldots, y_n)$ 
by the formula 

$$\rho_0(x, y) = \max_{1 \le k \le n} |x_k - y_k|. \tag{5}$$

Then we again get a metric space (verify all three properties of the distance). 
This space, denoted by $R_0^n$, is often as useful as the Euclidean space $R^n$. 


Remark. The last three examples show that it is sometimes important 
to use a different notation for a metric space than for the underlying set of 
points in the space, since the latter can be “metrized” in a variety of different 
ways. 
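
To see concretely how the same pair of points acquires different distances 
under the metrics (1), (4) and (5), one can evaluate all three. This is a 
minimal sketch; the function names `rho`, `rho_1`, `rho_0` simply mirror the 
symbols of Examples 3-5, and the sample points are our own. 

```python
import math

def rho(x, y):
    """Euclidean distance, formula (1)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def rho_1(x, y):
    """Sum of absolute coordinate differences, formula (4)."""
    return sum(abs(a - b) for a, b in zip(x, y))

def rho_0(x, y):
    """Largest absolute coordinate difference, formula (5)."""
    return max(abs(a - b) for a, b in zip(x, y))

x, y = (0.0, 0.0), (3.0, 4.0)
print(rho(x, y), rho_1(x, y), rho_0(x, y))   # 5.0 7.0 4.0
```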


Example 6. The set $C_{[a,b]}$ of all continuous functions defined on the 
closed interval [a, b], with distance 

$$\rho(f, g) = \max_{a \le t \le b} |f(t) - g(t)| \tag{6}$$

is a metric space of great importance in analysis (again verify the three 
properties of distance). This metric space and the underlying set of “points” 
will both be denoted by the symbol $C_{[a,b]}$. Instead of $C_{[0,1]}$, we will often 
write just C. A space like $C_{[a,b]}$ is often called a “function space,” to 
emphasize that its elements are functions. 


Example 7. Let $l_2$ be the set of all infinite sequences² 

$$x = (x_1, x_2, \ldots, x_k, \ldots)$$

of real numbers $x_1, x_2, \ldots, x_k, \ldots$ satisfying the convergence condition 

$$\sum_{k=1}^{\infty} x_k^2 < \infty,$$

where distance between points is defined by 

$$\rho(x, y) = \sqrt{\sum_{k=1}^{\infty} (x_k - y_k)^2}. \tag{7}$$

Clearly (7) makes sense for all $x, y \in l_2$, since it follows from the elementary 
inequality 

$$(x_k \pm y_k)^2 \le 2(x_k^2 + y_k^2)$$

that convergence of the two series 

$$\sum_{k=1}^{\infty} x_k^2, \qquad \sum_{k=1}^{\infty} y_k^2$$

implies that of the series 

$$\sum_{k=1}^{\infty} (x_k - y_k)^2.$$

At the same time, we find that if the points $(x_1, x_2, \ldots, x_k, \ldots)$ and 
$(y_1, y_2, \ldots, y_k, \ldots)$ both belong to $l_2$, then so does the point 

$$(x_1 + y_1, x_2 + y_2, \ldots, x_k + y_k, \ldots).$$

The function (7) obviously has the first two defining properties of a distance. 
To verify the triangle inequality, which takes the form 

$$\sqrt{\sum_{k=1}^{\infty} (x_k - z_k)^2} \le \sqrt{\sum_{k=1}^{\infty} (x_k - y_k)^2} + \sqrt{\sum_{k=1}^{\infty} (y_k - z_k)^2} \tag{8}$$

for the metric (7), we first note that all three series converge, for the reason 
just given. Moreover, the inequality 

$$\sqrt{\sum_{k=1}^{n} (x_k - z_k)^2} \le \sqrt{\sum_{k=1}^{n} (x_k - y_k)^2} + \sqrt{\sum_{k=1}^{n} (y_k - z_k)^2} \tag{9}$$

holds for all n, as shown in Example 3. Taking the limit as $n \to \infty$ in (9), 
we get (8), thereby verifying the triangle inequality in $l_2$. Therefore $l_2$ is a 
metric space. 


² The infinite sequence with general term $x_k$ can be written as $\{x_k\}$ or simply as 
$x_1, x_2, \ldots, x_k, \ldots$ (this notation is familiar from calculus). It can also be written in 
“point notation” as $x = (x_1, x_2, \ldots, x_k, \ldots)$, i.e., as an “ordered $\infty$-tuple” generalizing 
the notion of an ordered n-tuple. (In writing $\{x_k\}$ we have another use of curly brackets, 
but the context will always prevent any confusion between the sequence $\{x_k\}$ and the set 
whose only element is $x_k$.) 


Example 8. As in Example 6, consider the set of all functions continuous 
on the interval [a, b], but this time define distance by the formula 

$$\rho(x, y) = \Bigl( \int_a^b [x(t) - y(t)]^2 \, dt \Bigr)^{1/2} \tag{10}$$

instead of (6). The resulting metric space will be denoted by $C^2_{[a,b]}$. The 
first two properties of the metric are obvious, and the fact that (10) satisfies 
the triangle inequality is an immediate consequence of Schwarz's inequality 

$$\Bigl( \int_a^b x(t) y(t) \, dt \Bigr)^2 \le \int_a^b x^2(t) \, dt \int_a^b y^2(t) \, dt \tag{11}$$

(see Problem 3), by the continuous analogue of the argument given in 
Example 3. 


Example 9. Next consider the set of all bounded infinite sequences of real 
numbers $x = (x_1, x_2, \ldots, x_k, \ldots)$, and let³ 

$$\rho(x, y) = \sup_k |x_k - y_k|. \tag{12}$$

This gives a metric space which we denote by m. The fact that (12) has the 
three properties of a metric is almost obvious. 


Example 10. As in Example 3, consider the set of all ordered n-tuples 
$x = (x_1, \ldots, x_n)$ of real numbers, but this time define the distance by the 
more general formula 

$$\rho_p(x, y) = \Bigl( \sum_{k=1}^{n} |x_k - y_k|^p \Bigr)^{1/p}, \tag{13}$$

where p is a fixed number $\ge 1$ (Examples 3 and 4 correspond to the cases 
p = 2 and p = 1, respectively). This gives a metric space, which we denote 
by $R_p^n$. It is obvious that $\rho_p(x, y) = 0$ if and only if x = y and that 
$\rho_p(x, y) = \rho_p(y, x)$, but verification of the triangle inequality for the 
metric (13) requires a little work. Let 

$$x = (x_1, \ldots, x_n), \quad y = (y_1, \ldots, y_n), \quad z = (z_1, \ldots, z_n)$$

be three points in $R_p^n$, and let 

$$a_k = x_k - y_k, \quad b_k = y_k - z_k \qquad (k = 1, \ldots, n),$$

just as in Example 3. Then the triangle inequality 

$$\rho_p(x, z) \le \rho_p(x, y) + \rho_p(y, z)$$

takes the form of Minkowski's inequality 

$$\Bigl( \sum_{k=1}^{n} |a_k + b_k|^p \Bigr)^{1/p} \le \Bigl( \sum_{k=1}^{n} |a_k|^p \Bigr)^{1/p} + \Bigl( \sum_{k=1}^{n} |b_k|^p \Bigr)^{1/p}. \tag{14}$$

The inequality is obvious for p = 1, and hence we can confine ourselves to 
the case p > 1. 

The proof of (14) for p > 1 is in turn based on Hölder's inequality 

$$\sum_{k=1}^{n} |a_k b_k| \le \Bigl( \sum_{k=1}^{n} |a_k|^p \Bigr)^{1/p} \Bigl( \sum_{k=1}^{n} |b_k|^q \Bigr)^{1/q}, \tag{15}$$

where the numbers p > 1 and q > 1 satisfy the condition 

$$\frac{1}{p} + \frac{1}{q} = 1. \tag{16}$$


³ The least upper bound or supremum of a sequence of real numbers $a_1, a_2, \ldots, a_k, \ldots$ 
is denoted by $\sup_k a_k$. 


We begin by observing that the inequality (15) is homogeneous, i.e., if it 
holds for two points $(a_1, \ldots, a_n)$ and $(b_1, \ldots, b_n)$, then it holds for any 
two points $(\lambda a_1, \ldots, \lambda a_n)$ and $(\mu b_1, \ldots, \mu b_n)$ where $\lambda$ and $\mu$ are arbitrary 
real numbers. Therefore we need only prove (15) for the case 

$$\sum_{k=1}^{n} |a_k|^p = \sum_{k=1}^{n} |b_k|^q = 1. \tag{17}$$

Thus, assuming that (17) holds, we now prove that 

$$\sum_{k=1}^{n} |a_k b_k| \le 1. \tag{18}$$

Consider the two areas $S_1$ and $S_2$ shown in Figure 8, associated with the 
curve in the $\xi\eta$-plane defined by the equation 

$$\eta = \xi^{p-1},$$

or equivalently by the equation 

$$\xi = \eta^{q-1}.$$

[Figure 8] 

Then clearly 

$$S_1 = \int_0^a \xi^{p-1} \, d\xi = \frac{a^p}{p}, \qquad S_2 = \int_0^b \eta^{q-1} \, d\eta = \frac{b^q}{q}.$$

Moreover, it is apparent from the figure that 

$$S_1 + S_2 \ge ab$$

for arbitrary positive a and b. It follows that 

$$ab \le \frac{a^p}{p} + \frac{b^q}{q}. \tag{19}$$

Setting $a = |a_k|$, $b = |b_k|$, summing over k from 1 to n, and taking account 
of (16) and (17), we get the desired inequality (18). This proves Hölder's 
inequality (15). Note that (15) reduces to Schwarz's inequality if p = 2. 
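
Hölder's inequality (15) is easy to test numerically. The following sketch is 
an illustration only: the function name `holder_sides`, the random test data 
and the tolerance are our own choices, and a numerical check on a few samples 
is no substitute for the proof above. 

```python
import random

def holder_sides(a, b, p):
    """Left and right sides of Hölder's inequality (15) for an exponent p > 1."""
    q = p / (p - 1)                       # conjugate exponent: 1/p + 1/q = 1
    left = sum(abs(s * t) for s, t in zip(a, b))
    right = (sum(abs(s) ** p for s in a) ** (1 / p)
             * sum(abs(t) ** q for t in b) ** (1 / q))
    return left, right

random.seed(0)
checks = []
for p in (1.5, 2.0, 3.0):
    a = [random.uniform(-1, 1) for _ in range(5)]
    b = [random.uniform(-1, 1) for _ in range(5)]
    left, right = holder_sides(a, b, p)
    checks.append(left <= right + 1e-12)  # small tolerance for rounding

print(all(checks))   # True
```

For p = 2 the conjugate exponent is q = 2, and the right-hand side is the 
product of the two Euclidean lengths, as in Schwarz's inequality. 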

It is now an easy matter to prove Minkowski's inequality (14), starting 
from the identity 

$$(|a| + |b|)^p = (|a| + |b|)^{p-1} |a| + (|a| + |b|)^{p-1} |b|.$$

In fact, setting $a = a_k$, $b = b_k$ and summing over k from 1 to n, we obtain 

$$\sum_{k=1}^{n} (|a_k| + |b_k|)^p = \sum_{k=1}^{n} (|a_k| + |b_k|)^{p-1} |a_k| + \sum_{k=1}^{n} (|a_k| + |b_k|)^{p-1} |b_k|.$$

Next we apply Hölder's inequality (15) to both sums on the right, bearing 
in mind that (p - 1)q = p: 

$$\sum_{k=1}^{n} (|a_k| + |b_k|)^p \le \Bigl( \sum_{k=1}^{n} (|a_k| + |b_k|)^p \Bigr)^{1/q} \Bigl( \Bigl[ \sum_{k=1}^{n} |a_k|^p \Bigr]^{1/p} + \Bigl[ \sum_{k=1}^{n} |b_k|^p \Bigr]^{1/p} \Bigr).$$

Dividing both sides of this inequality by 

$$\Bigl( \sum_{k=1}^{n} (|a_k| + |b_k|)^p \Bigr)^{1/q},$$

we get 

$$\Bigl( \sum_{k=1}^{n} (|a_k| + |b_k|)^p \Bigr)^{1/p} \le \Bigl( \sum_{k=1}^{n} |a_k|^p \Bigr)^{1/p} + \Bigl( \sum_{k=1}^{n} |b_k|^p \Bigr)^{1/p},$$

which immediately implies (14), thereby proving the triangle inequality in $R_p^n$. 
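
As a sanity check on the result just proved, the triangle inequality for the 
distance (13) can be tested on random points for several values of $p \ge 1$. 
The sketch below is illustrative only; the name `rho_p` mirrors the symbol in 
(13), while the sample sizes, ranges and tolerance are arbitrary choices of 
ours. 

```python
import random

def rho_p(x, y, p):
    """Distance (13) between two n-tuples for a fixed p >= 1."""
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1 / p)

random.seed(1)
triangle_ok = True
for p in (1, 1.5, 2, 4):
    for _ in range(200):
        x, y, z = ([random.uniform(-5, 5) for _ in range(3)] for _ in range(3))
        # Allow a tiny tolerance for floating-point rounding.
        if rho_p(x, z, p) > rho_p(x, y, p) + rho_p(y, z, p) + 1e-9:
            triangle_ok = False

print(triangle_ok)   # True
```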

Example 11. Finally let $l_p$ be the set of all infinite sequences 

$$x = (x_1, x_2, \ldots, x_k, \ldots)$$

of real numbers satisfying the convergence condition 

$$\sum_{k=1}^{\infty} |x_k|^p < \infty$$

for some fixed number $p \ge 1$, where distance between points is defined by 

$$\rho(x, y) = \Bigl( \sum_{k=1}^{\infty} |x_k - y_k|^p \Bigr)^{1/p} \tag{20}$$

(the case p = 2 has already been considered in Example 7). It follows from 
Minkowski's inequality (14) that 

$$\Bigl( \sum_{k=1}^{n} |x_k - y_k|^p \Bigr)^{1/p} \le \Bigl( \sum_{k=1}^{n} |x_k|^p \Bigr)^{1/p} + \Bigl( \sum_{k=1}^{n} |y_k|^p \Bigr)^{1/p} \tag{21}$$

for any n. Since the series 

$$\sum_{k=1}^{\infty} |x_k|^p, \qquad \sum_{k=1}^{\infty} |y_k|^p$$

converge, by hypothesis, we can take the limit as $n \to \infty$ in (21), obtaining 

$$\Bigl( \sum_{k=1}^{\infty} |x_k - y_k|^p \Bigr)^{1/p} \le \Bigl( \sum_{k=1}^{\infty} |x_k|^p \Bigr)^{1/p} + \Bigl( \sum_{k=1}^{\infty} |y_k|^p \Bigr)^{1/p} < \infty.$$

This shows that (20) actually makes sense for arbitrary $x, y \in l_p$. At the same 
time, we have verified that the triangle inequality holds in $l_p$ (the other two 
properties of a metric are obviously satisfied). Therefore $l_p$ is a metric space. 


Remark. If $R = (X, \rho)$ is a metric space and M is any subset of X, then 
obviously $R^* = (M, \rho)$ is again a metric space, called a subspace of the 
original metric space R. This device gives us infinitely more examples of 
metric spaces. 


5.2. Continuous mappings and homeomorphisms. Isometric spaces. Let f 
be a mapping of one metric space X into another metric space Y, so that 
f associates an element $y = f(x) \in Y$ with each element $x \in X$. Then f is 
said to be continuous at the point $x_0 \in X$ if, given any $\varepsilon > 0$, there exists a 
$\delta > 0$ such that 

$$\rho'(f(x), f(x_0)) < \varepsilon$$

whenever 

$$\rho(x, x_0) < \delta$$

(here $\rho$ is the metric in X and $\rho'$ the metric in Y). The mapping f is said 
to be continuous on X if it is continuous at every point $x \in X$. 


Remark. This definition reduces to the usual definition of continuity 
familiar from calculus if X and Y are both numerical sets, i.e., if f is a real 
function defined on some subset of the real line. 


Given two metric spaces X and Y, let f be a one-to-one mapping of X onto 
Y, and suppose f and $f^{-1}$ are both continuous. Then f is called a 
homeomorphic mapping, or simply a homeomorphism (between X and Y). Two 
spaces X and Y are said to be homeomorphic if there exists a homeomorphism 
between them. 


Example. The function 

$$y = f(x) = \frac{2}{\pi} \arctan x$$

establishes a homeomorphism between the whole real line $(-\infty, \infty)$ and the 
open interval (-1, 1). 
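
The mapping in this example and its inverse can be written out explicitly. 
In the sketch below, the inverse $x = \tan(\pi y / 2)$ is obtained by solving 
$y = (2/\pi) \arctan x$ for x; the names `f` and `f_inv` and the sample points are 
our own. The code only illustrates that f sends the real line into (-1, 1), 
preserves order, and is undone by `f_inv`; it does not by itself establish 
continuity. 

```python
import math

def f(x):
    """Map the real line onto the open interval (-1, 1)."""
    return 2 / math.pi * math.atan(x)

def f_inv(y):
    """Inverse map from (-1, 1) back to the real line."""
    return math.tan(math.pi * y / 2)

samples = [-1000.0, -1.0, 0.0, 0.5, 1000.0]
images = [f(x) for x in samples]

inside = all(-1 < v < 1 for v in images)
increasing = all(u < v for u, v in zip(images, images[1:]))
round_trip = all(math.isclose(f_inv(f(x)), x, rel_tol=1e-6, abs_tol=1e-9)
                 for x in samples)
```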


DEFINITION 2. A one-to-one mapping f of one metric space $R = (X, \rho)$ 
onto another metric space $R' = (Y, \rho')$ is said to be an isometric mapping 
(or simply an isometry) if 

$$\rho(x_1, x_2) = \rho'(f(x_1), f(x_2))$$

for all $x_1, x_2 \in R$. Correspondingly, the spaces R and R' are said to be 
isometric (to each other). 


Thus if R and R' are isometric, the “metric relations” between the 
elements of R are the same as those between the elements of R', i.e., R and 
R' differ only in the explicit nature of their elements (this distinction is 
unimportant from the standpoint of metric space theory). From now on, 
we will not distinguish between isometric spaces, regarding them simply as 
identical. 

Remark. We will discuss continuity and homeomorphisms from a more 
general point of view in Sec. 9.6. 


Problem 1. Given a metric space $(X, \rho)$, prove that 

a) $|\rho(x, z) - \rho(y, u)| \le \rho(x, y) + \rho(z, u)$ $\quad (x, y, z, u \in X)$; 
b) $|\rho(x, z) - \rho(y, z)| \le \rho(x, y)$ $\quad (x, y, z \in X)$. 


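Problem 2. Verify that 

$$\Bigl( \sum_{k=1}^{n} a_k b_k \Bigr)^2 = \sum_{k=1}^{n} a_k^2 \sum_{k=1}^{n} b_k^2 - \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} (a_i b_j - b_i a_j)^2.$$

Deduce the Cauchy-Schwarz inequality (3) from this identity. 


Problem 3. Verify that 

$$\Bigl( \int_a^b x(t) y(t) \, dt \Bigr)^2 = \int_a^b x^2(t) \, dt \int_a^b y^2(t) \, dt - \frac{1}{2} \int_a^b \int_a^b [x(s) y(t) - y(s) x(t)]^2 \, ds \, dt.$$

Deduce Schwarz's inequality (11) from this identity. 


Problem 4. What goes wrong in Example 10, p. 41 if p < 1? 
Hint. Show that Minkowski's inequality fails for p < 1. 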


Problem 5. Prove that the metric (5) is the limiting case of the metric (13) 
in the sense that 

$$\rho_0(x, y) = \max_{1 \le k \le n} |x_k - y_k| = \lim_{p \to \infty} \Bigl( \sum_{k=1}^{n} |x_k - y_k|^p \Bigr)^{1/p}.$$


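Problem 6. Starting from the inequality (19), deduce Hölder's integral 
inequality 

$$\int_a^b |x(t) y(t)| \, dt \le \Bigl( \int_a^b |x(t)|^p \, dt \Bigr)^{1/p} \Bigl( \int_a^b |y(t)|^q \, dt \Bigr)^{1/q} \qquad \Bigl( \frac{1}{p} + \frac{1}{q} = 1 \Bigr),$$

valid for any functions x(t) and y(t) such that the integrals on the right exist. 


Problem 7. Use Hölder's integral inequality to prove Minkowski's integral 
inequality 

$$\Bigl( \int_a^b |x(t) + y(t)|^p \, dt \Bigr)^{1/p} \le \Bigl( \int_a^b |x(t)|^p \, dt \Bigr)^{1/p} + \Bigl( \int_a^b |y(t)|^p \, dt \Bigr)^{1/p} \qquad (p \ge 1).$$


Problem 8. Exhibit an isometry between the spaces $C_{[0,1]}$ and $C_{[1,2]}$. 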


6. Convergence. Open and Closed Sets 


6.1. Closure of a set. Limit points. By the open sphere (or open ball) 
$S(x_0, r)$ in a metric space R we mean the set of points $x \in R$ satisfying the 
inequality 

$$\rho(x_0, x) < r$$

($\rho$ is the metric of R).⁴ The fixed point $x_0$ is called the center of the sphere, 
and the number r is called its radius. By the closed sphere (or closed ball) 
$S[x_0, r]$ with center $x_0$ and radius r we mean the set of points $x \in R$ satisfying 
the inequality 

$$\rho(x_0, x) \le r.$$

An open sphere of radius $\varepsilon$ with center $x_0$ will also be called an $\varepsilon$-neighborhood 
of $x_0$, denoted by $O_\varepsilon(x_0)$. 

A point $x \in R$ is called a contact point of a set $M \subset R$ if every 
neighborhood of x contains at least one point of M. The set of all contact points of a 
set M is denoted by [M] and is called the closure of M. Obviously $M \subset [M]$, 
since every point of M is a contact point of M. By the closure operator in 
a metric space R, we mean the mapping of R into R carrying each set $M \subset R$ 
into its closure [M]. 


THEOREM 1. The closure operator has the following properties: 


1) If $M \subset N$, then $[M] \subset [N]$; 
2) $[[M]] = [M]$; 
3) $[M \cup N] = [M] \cup [N]$; 
4) $[\emptyset] = \emptyset$. 


Proof. Property 1) is obvious. To prove property 2), let $x \in [[M]]$. 
Then any given neighborhood $O_\varepsilon(x)$ contains a point $x_1 \in [M]$. Consider 
the sphere $O_{\varepsilon_1}(x_1)$ of radius 

$$\varepsilon_1 = \varepsilon - \rho(x, x_1).$$

Clearly $O_{\varepsilon_1}(x_1)$ is contained in $O_\varepsilon(x)$. In fact, if $z \in O_{\varepsilon_1}(x_1)$, then 
$\rho(z, x_1) < \varepsilon_1$ and hence, since $\rho(x, x_1) = \varepsilon - \varepsilon_1$, it follows from the 
triangle inequality that 

$$\rho(z, x) < \varepsilon_1 + (\varepsilon - \varepsilon_1) = \varepsilon,$$

i.e., $z \in O_\varepsilon(x)$. Since $x_1 \in [M]$, there is a point $x_2 \in M$ in $O_{\varepsilon_1}(x_1)$. But 
then $x_2 \in O_\varepsilon(x)$ and hence $x \in [M]$, since $O_\varepsilon(x)$ is an arbitrary 
neighborhood of x. Therefore $[[M]] \subset [M]$. But obviously $[M] \subset [[M]]$ and 
hence $[[M]] = [M]$, as required. 

To prove property 3), let $x \in [M \cup N]$ and suppose $x \notin [M] \cup [N]$. 
Then $x \notin [M]$ and $x \notin [N]$. But then there exist neighborhoods $O_{\varepsilon_1}(x)$ 
and $O_{\varepsilon_2}(x)$ such that $O_{\varepsilon_1}(x)$ contains no points of M while $O_{\varepsilon_2}(x)$ contains 
no points of N. It follows that the neighborhood $O_\varepsilon(x)$, where 
$\varepsilon = \min \{\varepsilon_1, \varepsilon_2\}$, contains no points of either M or N, and hence no points 
of $M \cup N$, contrary to the assumption that $x \in [M \cup N]$. Therefore 
$x \in [M] \cup [N]$, and hence 

$$[M \cup N] \subset [M] \cup [N], \tag{1}$$

since x is an arbitrary point of $[M \cup N]$. On the other hand, since 
$M \subset M \cup N$ and $N \subset M \cup N$, it follows from property 1) that 
$[M] \subset [M \cup N]$ and $[N] \subset [M \cup N]$. But then 

$$[M] \cup [N] \subset [M \cup N],$$

which together with (1) implies $[M \cup N] = [M] \cup [N]$. 

Finally, to prove property 4), we observe that given any $M \subset R$, 

$$[M] = [M \cup \emptyset] = [M] \cup [\emptyset],$$

by property 3). It follows that $[\emptyset] \subset [M]$. But this is possible for 
arbitrary M only if $[\emptyset] = \emptyset$. (Alternatively, the set with no elements 
can have no contact points!) ∎ 


⁴ Any confusion between “sphere” meant in the sense of spherical surface and “sphere” 
meant in the sense of a solid sphere (or ball) will always be avoided by judicious use of the 
adjectives “open” or “closed.” 


A point $x \in R$ is called a limit point of a set $M \subset R$ if every neighborhood 
of x contains infinitely many points of M. The limit point may or may not 
belong to M. For example, if M is the set of rational numbers in the interval 
[0, 1], then every point of [0, 1], rational or not, is a limit point of M. 

A point x belonging to a set M is called an isolated point of M if there 
is a (“sufficiently small”) neighborhood of x containing no points of M other 
than x itself. 


6.2. Convergence and limits. A sequence of points $\{x_n\} = x_1, x_2, \ldots, 
x_n, \ldots$ in a metric space R is said to converge to a point $x \in R$ if every 
neighborhood $O_\varepsilon(x)$ of x contains all points $x_n$ starting from a certain index 
(more exactly, if, given any $\varepsilon > 0$, there is an integer $N_\varepsilon$ such that $O_\varepsilon(x)$ 
contains all points $x_n$ with $n > N_\varepsilon$). The point x is called the limit of the 
sequence $\{x_n\}$, and we write $x_n \to x$ (as $n \to \infty$). Clearly, $\{x_n\}$ converges to 
x if and only if 

$$\lim_{n \to \infty} \rho(x, x_n) = 0.$$

It is an immediate consequence of the definition of a limit that 


1) No sequence can have two distinct limits; 


2) If a sequence $\{x_n\}$ converges to a point x, then so does every 
subsequence of $\{x_n\}$ 


(give the details). 


THEOREM 2. A necessary and sufficient condition for a point x to be a 
contact point of a set M is that there exist a sequence $\{x_n\}$ of points of M 
converging to x. 


Proof. The condition is necessary, since if x is a contact point of M, 
then every neighborhood $O_{1/n}(x)$ contains at least one point $x_n \in M$, 
and these points form a sequence $\{x_n\}$ converging to x. The sufficiency 
is obvious. ∎ 


THEOREM 2'. A necessary and sufficient condition for a point x to be a 
limit point of a set M is that there exist a sequence $\{x_n\}$ of distinct points 
of M converging to x. 


Proof. Clearly, if x is a limit point of M, then the points 
$x_n \in O_{1/n}(x) \cap M$ figuring in the proof of Theorem 2 can be chosen to be 
distinct. This proves the necessity, and the sufficiency is again obvious. ∎ 
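
The definition of convergence at the beginning of this section can be made 
concrete for a specific sequence. The sketch below takes the sequence 
$x_n = 1/n$ in $R^1$, which is our own example and not one from the text, and 
computes for a given $\varepsilon > 0$ the smallest integer $N_\varepsilon$ such that 
$\rho(0, x_n) < \varepsilon$ for every $n > N_\varepsilon$. The function name `first_index` is 
hypothetical. 

```python
def first_index(eps):
    """For x_n = 1/n and limit 0, return the smallest N such that
    |1/n - 0| < eps for every n > N."""
    N = 0
    # 1/n is decreasing, so it suffices to advance N while 1/(N+1) >= eps.
    while 1 / (N + 1) >= eps:
        N += 1
    return N

print(first_index(0.5))    # 2
print(first_index(0.25))   # 4
```

For $\varepsilon = 0.25$, for instance, the point $x_4 = 0.25$ is not yet inside the 
neighborhood $O_\varepsilon(0)$, but every $x_n$ with n > 4 is. 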


6.3. Dense subsets. Separable spaces. Let A and B be two subsets of a 
metric space R. Then A is said to be dense in B if $[A] \supset B$. In particular, 
A is said to be everywhere dense (in R) if [A] = R. A set A is said to be 
nowhere dense if it is dense in no (open) sphere at all. 


Example 1. The set of all rational points is dense in the real line $R^1$. 


Example 2. The set of all points $x = (x_1, x_2, \ldots, x_n)$ with rational 
coordinates is dense in each of the spaces $R^n$, $R_0^n$ and $R_1^n$ introduced in Examples 
3-5, pp. 38-39. 


Example 3. The set of all points $x = (x_1, x_2, \ldots, x_k, \ldots)$ with only 
finitely many nonzero coordinates, each a rational number, is dense in the 
space $l_2$ introduced in Example 7, p. 39. 


Example 4. The set of all polynomials with rational coefficients is dense 
in both spaces $C_{[a,b]}$ and $C^2_{[a,b]}$ introduced in Examples 6 and 8, pp. 39 and 
40. 


DEFINITION. A metric space is said to be separable if it has a countable 
everywhere dense subset. 


Example 5. The spaces $R^1$, $R^n$, $R_0^n$, $R_1^n$, $l_2$, $C_{[a,b]}$ and $C^2_{[a,b]}$ are all separable, 
since the sets in Examples 1-4 above are all countable. 


Example 6. The “discrete space” M described in Example 1, p. 38 
contains a countable everywhere dense subset and hence is separable if and only 
if it is itself a countable set, since clearly [M] = M in this case. 


Example 7. There is no countable everywhere dense set in the space m of 
all bounded sequences, introduced in Example 9, p. 41. In fact, consider 
the set E of all sequences consisting exclusively of zeros and ones. Clearly, 
E has the power of the continuum (recall Theorem 6, Sec. 2.5), since there 
is a one-to-one correspondence between E and the set of all subsets of the 
set $Z_+ = \{1, 2, \ldots, n, \ldots\}$ (describe the correspondence). According to 
formula (12), p. 41, the distance between any two points of E equals 1. 
Suppose we surround each point of E by an open sphere of radius 1/2, thereby 
obtaining an uncountably infinite family of pairwise disjoint spheres. Then 
if some set M is everywhere dense in m, there must be at least one point of 
M in each of the spheres. It follows that M cannot be countable and hence 
that m cannot be separable. 


6.4. Closed sets. We say that a subset M of a metric space R is closed if it
coincides with its own closure, i.e., if $[M] = M$. In other words, a set is
called closed if it contains all its limit points (see Problem 2).


Example 1. The empty set $\emptyset$ and the whole space R are closed sets.

Example 2. Every closed interval [a, b] on the real line is a closed set.

Example 3. Every closed sphere in a metric space is a closed set. In
particular, the set of all functions f in the space $C_{[a,b]}$ such that
$|f(t)| \le K$ (where K is a constant) is closed.

Example 4. The set of all functions f in $C_{[a,b]}$ such that $|f(t)| < K$ (an
open sphere) is not closed. The closure of this set is the closed sphere in the
preceding example.


Example 5. Any set consisting of a finite number of points is closed. 


THEOREM 3. The intersection of an arbitrary number of closed sets is 
closed. The union of a finite number of closed sets is closed. 


Proof. Given arbitrary closed sets $F_\alpha$ indexed by a parameter α, let x
be a limit point of the intersection

$$F = \bigcap_\alpha F_\alpha.$$

Then any neighborhood $O_\varepsilon(x)$ contains infinitely many points of F,
and hence infinitely many points of each $F_\alpha$. Therefore x is a limit
point of each $F_\alpha$ and hence belongs to each $F_\alpha$, since the sets
$F_\alpha$ are all closed. It follows that $x \in F$, and hence that F itself
is closed.

Next let

$$F = \bigcup_{k=1}^{n} F_k$$

be the union of a finite number of closed sets $F_k$, and suppose x does
not belong to F. Then x does not belong to any of the sets $F_k$, and hence
cannot be a limit point of any of them. But then, for every k, there is a
neighborhood $O_{\varepsilon_k}(x)$ containing no more than a finite number of
points of $F_k$. Choosing

$$\varepsilon = \min \{\varepsilon_1, \ldots, \varepsilon_n\},$$

we get a neighborhood $O_\varepsilon(x)$ containing no more than a finite number
of points of F, so that x cannot be a limit point of F. This proves that a
point $x \notin F$ cannot be a limit point of F. Therefore F is closed. ∎


6.5. Open sets. A point x is called an interior point of a set M if x has a
neighborhood $O_\varepsilon(x) \subset M$, i.e., a neighborhood consisting
entirely of points of M. A set is said to be open if its points are all
interior points.


Example 1. Every open interval (a, b) on the real line is an open set. In
fact, if a < x < b, choose $\varepsilon = \min \{x - a, b - x\}$. Then clearly
$O_\varepsilon(x) \subset (a, b)$.

Example 2. Every open sphere S(a, r) in a metric space is an open set.
In fact, $x \in S(a, r)$ implies $\rho(a, x) < r$. Hence, choosing
$\varepsilon = r - \rho(a, x)$, we have
$O_\varepsilon(x) = S(x, \varepsilon) \subset S(a, r)$.


Example 3. Let M be the set of all functions f in $C_{[a,b]}$ such that
$f(t) < g(t)$, where g is a fixed function in $C_{[a,b]}$. Then M is an open
subset of $C_{[a,b]}$.


THEOREM 4. A subset M of a metric space R is open if and only if its 
complement R — M is closed. 


Proof. If M is open, then every point $x \in M$ has a neighborhood
(entirely) contained in M. Therefore no point $x \in M$ can be a contact
point of $R - M$. In other words, if x is a contact point of $R - M$,
then $x \in R - M$, i.e., $R - M$ is closed.

Conversely, if $R - M$ is closed, then any point $x \in M$ must have a
neighborhood contained in M, since otherwise every neighborhood of x
would contain points of $R - M$, i.e., x would be a contact point of
$R - M$ not in $R - M$. Therefore M is open. ∎


COROLLARY. The empty set $\emptyset$ and the whole space R are open sets.

Proof. An immediate consequence of Theorem 4 and Example 1,
Sec. 6.4. ∎

THEOREM 5. The union of an arbitrary number of open sets is open. The 
intersection of a finite number of open sets is open. 


Proof. This is the “dual” of Theorem 3. The proof is an immediate
consequence of Theorem 4 and formulas (3)-(4), p. 4. ∎

6.6. Open and closed sets on the real line. The structure of open and closed
sets in a given metric space can be quite complicated. This is true even for
open and closed sets in a Euclidean space of two or more dimensions
($R^n$, $n \ge 2$). In the one-dimensional case, however, it is an easy matter
to give a complete description of all open sets (and hence of all closed sets):


THEOREM 6. Every open set G on the real line is the union of a finite or
countable system of pairwise disjoint open intervals.⁵


Proof. Let x be an arbitrary point of G. By the definition of an open
set, there is at least one open interval containing x and contained in G.
Let $I_x$ be the union of all such open intervals. Then, as we now show,
$I_x$ is itself an open interval. In fact, let⁶

$$a = \inf I_x, \qquad b = \sup I_x$$

(where we allow the cases $a = -\infty$ and $b = +\infty$). Then obviously

$$I_x \subset (a, b). \tag{2}$$

Moreover, suppose y is an arbitrary point of (a, b) distinct from x,
where, to be explicit, we assume that a < y < x. Then there is a point
$y' \in I_x$ such that $a < y' < y$ (why?). Hence G contains an open interval
containing the points y′ and x. But then this interval also contains y,
i.e., $y \in I_x$. (The case y > x is treated similarly.) Moreover, the point
x belongs to $I_x$, by hypothesis. It follows that $I_x \supset (a, b)$, and
hence by (2) that $I_x = (a, b)$. Thus $I_x$ is itself an open interval, as
asserted, in fact the open interval (a, b).

By its very construction, the interval (a, b) is contained in G and is
not a subset of a larger interval contained in G. Moreover, it is clear
that two intervals $I_x$ and $I_{x'}$ corresponding to distinct points x and x′
either coincide or else are disjoint (otherwise $I_x$ and $I_{x'}$ would both be
contained in a larger interval $I_x \cup I_{x'} = I \subset G$). There are no
more than countably many such pairwise disjoint intervals $I_x$. In fact,
choosing an arbitrary rational point in each $I_x$, we establish a one-to-one
correspondence between the intervals $I_x$ and a subset of the rational
numbers. Finally, it is obvious that

$$G = \bigcup_x I_x.$$
∎
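For an open set given as a finite union of open intervals, the maximal intervals of the proof can be computed directly. The following is only an illustrative sketch: the function name `components` and the sample set are assumptions made for this example, and a finite family of intervals is of course a very special case of the theorem.

```python
def components(intervals):
    """Merge a finite family of open intervals (a, b) into the pairwise
    disjoint maximal open intervals having the same union."""
    result = []
    # Sort by left end point and skip empty intervals (a >= b).
    for a, b in sorted(iv for iv in intervals if iv[0] < iv[1]):
        # Two open intervals lie in the same component only if they
        # overlap; (0, 3) and (3, 4) stay separate, since the point 3
        # belongs to neither of them.
        if result and a < result[-1][1]:
            result[-1] = (result[-1][0], max(result[-1][1], b))
        else:
            result.append((a, b))
    return result


G = [(0, 2), (1, 3), (3, 4), (5, 6), (5.5, 7)]
print(components(G))  # [(0, 3), (3, 4), (5, 7)]
```

Each interval in the output plays the role of one of the maximal intervals in the proof: it is contained in the set and cannot be enlarged without leaving it.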


COROLLARY. Every closed set on the real line can be obtained by deleting
a finite or countable system of pairwise disjoint open intervals from the
line.

Proof. An immediate consequence of Theorems 4 and 6. ∎

⁵ The infinite intervals $(-\infty, \infty)$, $(a, \infty)$, and $(-\infty, b)$
are regarded as open.
⁶ Given a set of real numbers E, inf E denotes the greatest lower bound or
infimum of E, while sup E denotes the least upper bound or supremum of E.


Example 1. Every closed interval [a, b] is a closed set (here a and b are
necessarily finite).

Example 2. Every single-element set $\{x_0\}$ is closed.


Example 3. The union of a finite number of closed intervals and single- 
element sets is a closed set. 


Example 4 (The Cantor set). A more interesting example of a closed set
on the line can be constructed as follows: Delete the open interval (1/3, 2/3)
from the closed interval $F_0 = [0, 1]$, and let $F_1$ denote the remaining
closed set, consisting of two closed intervals. Then delete the open intervals
(1/9, 2/9) and (7/9, 8/9) from $F_1$, and let $F_2$ denote the remaining closed
set, consisting of four closed intervals. Then delete the “middle third” from
each of these four intervals, getting a new closed set $F_3$, and so on (see
Figure 9). Continuing this process indefinitely, we get a sequence of closed
sets $F_n$ such that

$$F_0 \supset F_1 \supset F_2 \supset \cdots \supset F_n \supset \cdots$$

(such a sequence is said to be decreasing). The intersection

$$F = \bigcap_{n=0}^{\infty} F_n$$

of all these sets is called the Cantor set. Clearly F is closed, by Theorem 3,
and is obtained from the unit interval [0, 1] by deleting a countable number
of open intervals. In fact, at the nth stage of the construction, we delete
$2^{n-1}$ intervals, each of length $1/3^n$.
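The first few stages of the construction can be reproduced with exact rational arithmetic. This is a small sketch for illustration only; the helper name `next_stage` and the choice of five stages are assumptions of the example.

```python
from fractions import Fraction


def next_stage(closed_intervals):
    """Delete the open middle third of every closed interval [a, b]."""
    out = []
    for a, b in closed_intervals:
        third = (b - a) / 3
        out.append((a, a + third))   # left closed third
        out.append((b - third, b))   # right closed third
    return out


stages = [[(Fraction(0), Fraction(1))]]      # F_0 = [0, 1]
for n in range(1, 6):
    stages.append(next_stage(stages[-1]))    # F_n is obtained from F_{n-1}

for n in range(1, 6):
    # At stage n one middle third is deleted from each interval of F_{n-1}.
    deleted = len(stages[n - 1])
    a, b = stages[n - 1][0]
    assert deleted == 2 ** (n - 1)
    assert (b - a) / 3 == Fraction(1, 3 ** n)

# Total length deleted during the first five stages.
total_deleted = sum(Fraction(2 ** (n - 1), 3 ** n) for n in range(1, 6))
print(stages[2])
print(total_deleted)  # 211/243
```

The partial sums of the deleted lengths equal $1 - (2/3)^n$, which tends to 1 as n grows.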

To describe the structure of the set F, we first note that F contains the
points

$$0, \; 1, \; \tfrac{1}{3}, \; \tfrac{2}{3}, \; \tfrac{1}{9}, \; \tfrac{2}{9}, \; \tfrac{7}{9}, \; \tfrac{8}{9}, \; \ldots, \tag{3}$$

i.e., the end points of the deleted intervals (together with the points 0 and 1).

[FIGURE 9. The first stages of the construction of the Cantor set on [0, 1].]

However F contains many other points. In fact, given any $x \in [0, 1]$, suppose
we write x in ternary notation, representing x as a series

$$x = \frac{a_1}{3} + \frac{a_2}{3^2} + \cdots + \frac{a_n}{3^n} + \cdots, \tag{4}$$

where each of the numbers $a_1, a_2, \ldots, a_n, \ldots$ can only take one of the
three values 0, 1, 2. Then it is easy to see that x belongs to F if and only if
x has a representation (4) such that none of the numbers
$a_1, a_2, \ldots, a_n, \ldots$ equals 1 (think things through).⁷

Remarkably enough, the set F has the power of the continuum, i.e.,
there are as many points in F as in the whole interval [0, 1], despite the fact
that the sum of the lengths of the deleted intervals equals

$$\frac{1}{3} + \frac{2}{9} + \frac{4}{27} + \cdots = 1.$$

To see this, we associate a new point

$$y = \frac{b_1}{2} + \frac{b_2}{2^2} + \cdots + \frac{b_n}{2^n} + \cdots$$

with each point (4), where⁸

$$b_n = \begin{cases} 0 & \text{if } a_n = 0, \\ 1 & \text{if } a_n = 2. \end{cases}$$

In this way, we set up a mapping of F onto the whole interval [0, 1]. (The
mapping is one-to-one except that the two end points of each deleted
interval, e.g., 1/3 and 2/3, have the same image.) Since F is also a subset of
[0, 1], it follows that F has the power of the continuum, as asserted.
Let $A_1$ be the set of points (3). Then $F = A_1 \cup A_2$, where the set
$A_2 = F - A_1$ is uncountable, since $A_1$ is countable and F itself is not.
The points of $A_1$ are often called “points (of F) of the first kind,” while
those of $A_2$ are called “points of the second kind.”
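The passage from ternary to binary digits can be tried out on finite digit strings. The sketch below assumes the rule $b_n = 0$ if $a_n = 0$ and $b_n = 1$ if $a_n = 2$; the function names are hypothetical, and only finitely many digits are handled, so infinite expansions appear as truncations.

```python
from fractions import Fraction


def ternary_value(digits):
    """Value of a_1/3 + a_2/3**2 + ... for a finite list of ternary digits."""
    return sum(Fraction(a, 3 ** n) for n, a in enumerate(digits, start=1))


def cantor_to_unit(digits):
    """Send digits a_n in {0, 2} to y = b_1/2 + b_2/2**2 + ... with b_n = a_n / 2."""
    if any(a not in (0, 2) for a in digits):
        raise ValueError("a point of F needs a representation using only 0 and 2")
    return sum(Fraction(a // 2, 2 ** n) for n, a in enumerate(digits, start=1))


print(ternary_value([2]), cantor_to_unit([2]))              # 2/3 1/2
print(ternary_value([0, 2]), cantor_to_unit([0, 2]))        # 2/9 1/4
print(ternary_value([2, 0, 2]), cantor_to_unit([2, 0, 2]))  # 20/27 5/8

# 1/3 = 0.0222... in ternary: its truncations are sent to points approaching 1/2,
# which is also the image of 2/3 = 0.2000...
truncated = [0] + [2] * 40
print(Fraction(1, 2) - cantor_to_unit(truncated))           # 1/2199023255552
```

The last line illustrates why the two end points of a deleted interval have the same image under this correspondence.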


Problem 1. Give an example of a metric space R and two open spheres
$S(x, r_1)$ and $S(y, r_2)$ in R such that $S(x, r_1) \subset S(y, r_2)$ although
$r_1 > r_2$.


Problem 2. Prove that every contact point of a set M is either a limit point 
of M or an isolated point of M. 


Comment. In particular, [M] can only contain points of the following
three types:

a) Limit points of M belonging to M;
b) Limit points of M which do not belong to M;
c) Isolated points of M.

Thus [M] is the union of M and the set of all its limit points.

⁷ Just as in the case of ordinary decimals, certain numbers can be written in two
distinct ways. For example,

$$\frac{1}{3} = \frac{1}{3} + \frac{0}{3^2} + \cdots + \frac{0}{3^n} + \cdots = \frac{0}{3} + \frac{2}{3^2} + \cdots + \frac{2}{3^n} + \cdots.$$

Since none of the numerators in the second representation equals 1, the point 1/3
belongs to F (this is already obvious from the construction of F).
⁸ If x has two representations of the form (4), then one and only one of them has no
numerators $a_1, a_2, \ldots, a_n, \ldots$ equal to 1. These are the numbers used to
define $b_n$.


Problem 3. Prove that if $x_n \to x$, $y_n \to y$ as $n \to \infty$, then
$\rho(x_n, y_n) \to \rho(x, y)$.

Hint. Use Problem 1a, p. 45.


Problem 4. Let f be a mapping of one metric space X into another metric
space Y. Prove that f is continuous at a point $x_0$ if and only if the sequence
$\{y_n\} = \{f(x_n)\}$ converges to $y = f(x_0)$ whenever the sequence $\{x_n\}$
converges to $x_0$.


Problem 5. Prove that 


a) The closure of any set is a closed set;
b) [M] is the smallest closed set containing M.


Problem 6. Is the union of infinitely many closed sets necessarily closed?
How about the intersection of infinitely many open sets? Give examples.


Problem 7. Prove directly that the point 1/4 belongs to the Cantor set F,
although it is not an end point of any of the open intervals deleted in
constructing F.

Hint. The point 1/4 divides the interval [0, 1] in the ratio 1:3. It also
divides the interval [0, 1/3] left after deleting (1/3, 2/3) in the ratio 3:1,
and so on.


Problem 8. Let F be the Cantor set. Prove that

a) The points of the first kind, i.e., the points (3), form an everywhere
dense subset of F;
b) The numbers of the form $t_1 + t_2$, where $t_1, t_2 \in F$, fill the whole
interval [0, 2].


Problem 9. Given a metric space R, let A be a subset of R and x a point
of R. Then the number

$$\rho(A, x) = \inf_{a \in A} \rho(a, x)$$

is called the distance between A and x. Prove that

a) $x \in A$ implies $\rho(A, x) = 0$, but not conversely;
b) $\rho(A, x)$ is a continuous function of x (for fixed A);
c) $\rho(A, x) = 0$ if and only if x is a contact point of A;
d) $[A] = A \cup M$, where M is the set of all points x such that
$\rho(A, x) = 0$.


Problem 10. Let A and B be two subsets of a metric space R. Then the
number

$$\rho(A, B) = \inf_{a \in A, \, b \in B} \rho(a, b)$$

is called the distance between A and B. Show that $\rho(A, B) = 0$ if
$A \cap B \ne \emptyset$, but not conversely.


Problem 11. Let $M_K$ be the set of all functions f in $C_{[a,b]}$ satisfying a
Lipschitz condition, i.e., the set of all f such that

$$|f(t_1) - f(t_2)| \le K |t_1 - t_2|$$

for all $t_1, t_2 \in [a, b]$, where K is a fixed positive number. Prove that

a) $M_K$ is closed and in fact is the closure of the set of all differentiable
functions on [a, b] such that $|f'(t)| \le K$;
b) The set

$$M = \bigcup_K M_K$$

of all functions satisfying a Lipschitz condition for some K is not
closed;
c) The closure of M is the whole space $C_{[a,b]}$.


Problem 12. An open set G in n-dimensional Euclidean space $R^n$ is said
to be connected if any points $x, y \in G$ can be joined by a polygonal line⁹
lying entirely in G. For example, the (open) disk $x^2 + y^2 < 1$ is connected,
but not the union of the two disks

$$x^2 + y^2 < 1, \qquad (x - 2)^2 + y^2 < 1$$

(even though they share a contact point). An open subset of an open set G
is called a component of G if it is connected and is not contained in a larger
connected subset of G. Use Zorn’s lemma to prove that every open set G in
$R^n$ is the union of no more than countably many pairwise disjoint
components.


Comment. In the case n = 1 (i.e., on the real line) every connected open
set is an open interval, possibly one of the infinite intervals
$(-\infty, \infty)$, $(a, \infty)$, $(-\infty, b)$. Thus Theorem 6 on the structure
of open sets on the line is tantamount to two assertions:

1) Every open set on the line is the union of a finite or countable number
of components;
2) Every open connected set on the line is an open interval.

The first assertion holds for open sets in $R^n$ (and in fact is susceptible to
further generalizations), while the second assertion pertains specifically to
the real line.

⁹ By a polygonal line we mean a curve obtained by joining a finite number of straight
line segments end to end.


7. Complete Metric Spaces 


7.1. Definitions and examples. The reader is presumably already familiar 
with the notion of the completeness of the real line. The real line is, of course, 
a particularly simple example of a metric space. We now make the natural 
generalization of the notion of completeness to the case of an arbitrary 
metric space. 


DEFINITION 1. A sequence $\{x_n\}$ of points in a metric space R with metric
ρ is said to satisfy the Cauchy criterion if, given any $\varepsilon > 0$, there
is an integer $N_\varepsilon$ such that $\rho(x_n, x_{n'}) < \varepsilon$ for all
$n, n' > N_\varepsilon$.


DEFINITION 2. A sequence $\{x_n\}$ of points in a metric space R is called
a Cauchy sequence (or a fundamental sequence) if it satisfies the Cauchy
criterion.


THEOREM 1. Every convergent sequence $\{x_n\}$ is fundamental.


Proof. If $\{x_n\}$ converges to a limit x, then, given any $\varepsilon > 0$,
there is an integer $N_\varepsilon$ such that

$$\rho(x_n, x) < \frac{\varepsilon}{2}$$

for all $n > N_\varepsilon$. But then

$$\rho(x_n, x_{n'}) \le \rho(x_n, x) + \rho(x_{n'}, x) < \varepsilon$$

for all $n, n' > N_\varepsilon$. ∎


DEFINITION 3. A metric space R is said to be complete if every Cauchy 
sequence in R converges to an element of R. Otherwise R is said to be 
incomplete. 


Example 1. Let R be the “space of isolated points” considered in Example
1, p. 38. Then the Cauchy sequences in R are just the “stationary sequences,”
i.e., the sequences $\{x_n\}$ all of whose terms are the same starting from some
index n. Every such sequence is obviously convergent to an element of R.
Hence R is complete.


Example 2. The completeness of the real line $R^1$ is familiar from
elementary analysis.


Example 3. The completeness of Euclidean n-space $R^n$ follows from that
of $R^1$. In fact, let

$$x^{(p)} = (x_1^{(p)}, \ldots, x_n^{(p)}) \qquad (p = 1, 2, \ldots)$$

be a fundamental sequence of points of $R^n$. Then, given any
$\varepsilon > 0$, there exists an $N_\varepsilon$ such that

$$\sum_{k=1}^{n} (x_k^{(p)} - x_k^{(q)})^2 < \varepsilon^2$$

for all $p, q > N_\varepsilon$. It follows that

$$|x_k^{(p)} - x_k^{(q)}| < \varepsilon \qquad (k = 1, \ldots, n)$$

for all $p, q > N_\varepsilon$, i.e., each $\{x_k^{(p)}\}$ is a fundamental sequence
in $R^1$. Let

$$x = (x_1, \ldots, x_n),$$

where

$$x_k = \lim_{p \to \infty} x_k^{(p)}.$$

Then obviously

$$\lim_{p \to \infty} x^{(p)} = x.$$

This proves the completeness of $R^n$. The completeness of the spaces $R_0^n$
and $R_1^n$ introduced in Examples 4 and 5, p. 39 is proved in almost the same
way (give the details).


Example 4. Let $\{x_n(t)\}$ be a Cauchy sequence in the function space
$C_{[a,b]}$ considered in Example 6, p. 39. Then, given any $\varepsilon > 0$,
there is an $N_\varepsilon$ such that

$$|x_n(t) - x_{n'}(t)| < \varepsilon \tag{1}$$

for all $n, n' > N_\varepsilon$ and all $t \in [a, b]$. It follows that the sequence
$\{x_n(t)\}$ is uniformly convergent. But the limit of a uniformly convergent
sequence of continuous functions is itself a continuous function (see
Problem 1). Taking the limit as $n' \to \infty$ in (1), we find that

$$|x_n(t) - x(t)| \le \varepsilon$$

for all $n > N_\varepsilon$ and all $t \in [a, b]$, i.e., $\{x_n(t)\}$ converges in the
metric of $C_{[a,b]}$ to a function $x(t) \in C_{[a,b]}$. Hence $C_{[a,b]}$ is a
complete metric space.


Example 5. Next let $x^{(n)}$ be a sequence in the space $l_2$ considered in
Example 7, p. 39, so that

$$x^{(n)} = (x_1^{(n)}, x_2^{(n)}, \ldots, x_k^{(n)}, \ldots),$$

$$\sum_{k=1}^{\infty} (x_k^{(n)})^2 < \infty \qquad (n = 1, 2, \ldots).$$

Suppose further that $\{x^{(n)}\}$ is a Cauchy sequence. Then, given any
$\varepsilon > 0$, there is an $N_\varepsilon$ such that

$$\rho^2(x^{(n)}, x^{(n')}) = \sum_{k=1}^{\infty} (x_k^{(n)} - x_k^{(n')})^2 < \varepsilon \tag{2}$$

if $n, n' > N_\varepsilon$. It follows that

$$(x_k^{(n)} - x_k^{(n')})^2 < \varepsilon \qquad (k = 1, 2, \ldots),$$

i.e., for every k the sequence $\{x_k^{(n)}\}$ is fundamental and hence convergent.
Let

$$x_k = \lim_{n \to \infty} x_k^{(n)},$$

$$x = (x_1, x_2, \ldots, x_k, \ldots).$$

Then, as we now show, x is itself a point of $l_2$ and moreover $\{x^{(n)}\}$
converges to x in the $l_2$ metric, so that $l_2$ is a complete metric space.
In fact, (2) implies

$$\sum_{k=1}^{M} (x_k^{(n)} - x_k^{(n')})^2 < \varepsilon \tag{3}$$

for any fixed M. Holding n fixed in (3) and taking the limit as
$n' \to \infty$, we get

$$\sum_{k=1}^{M} (x_k^{(n)} - x_k)^2 \le \varepsilon. \tag{4}$$

Since (4) holds for arbitrary M, we can in turn take the limit of (4) as
$M \to \infty$, obtaining

$$\sum_{k=1}^{\infty} (x_k^{(n)} - x_k)^2 \le \varepsilon. \tag{5}$$

Just as on p. 40, the convergence of the two series

$$\sum_{k=1}^{\infty} (x_k^{(n)})^2, \qquad \sum_{k=1}^{\infty} (x_k^{(n)} - x_k)^2$$

implies that of the series

$$\sum_{k=1}^{\infty} x_k^2.$$

This proves that $x \in l_2$. Moreover, since ε is arbitrarily small, (5) implies

$$\lim_{n \to \infty} \rho(x^{(n)}, x) = \lim_{n \to \infty} \sqrt{\sum_{k=1}^{\infty} (x_k^{(n)} - x_k)^2} = 0,$$

i.e., $\{x^{(n)}\}$ converges to x in the $l_2$ metric, as asserted.


Example 6. It is easy to show that the space $C^2_{[a,b]}$ of Example 8, p. 40 is
incomplete. If


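$$\varphi_n(t) = \begin{cases} -1 & \text{if } -1 \le t \le -\frac{1}{n}, \\ nt & \text{if } -\frac{1}{n} \le t \le \frac{1}{n}, \\ 1 & \text{if } \frac{1}{n} \le t \le 1, \end{cases}$$

then $\{\varphi_n(t)\}$ is a fundamental sequence in $C^2_{[-1,1]}$, since

$$\int_{-1}^{1} [\varphi_n(t) - \varphi_{n'}(t)]^2 \, dt \le \frac{2}{\min \{n, n'\}}.$$

However, $\{\varphi_n(t)\}$ cannot converge to a function in $C^2_{[-1,1]}$. In fact,
consider the discontinuous function

$$\psi(t) = \begin{cases} -1 & \text{if } t < 0, \\ 1 & \text{if } t \ge 0. \end{cases}$$

Then, given any function $f \in C^2_{[-1,1]}$, it follows from Schwarz’s inequality
(obviously still valid for piecewise continuous functions) that

$$\left( \int_{-1}^{1} [f(t) - \psi(t)]^2 \, dt \right)^{1/2} \le \left( \int_{-1}^{1} [f(t) - \varphi_n(t)]^2 \, dt \right)^{1/2} + \left( \int_{-1}^{1} [\varphi_n(t) - \psi(t)]^2 \, dt \right)^{1/2}.$$

But the integral on the left is nonzero, by the continuity of f, and moreover
it is clear that

$$\lim_{n \to \infty} \int_{-1}^{1} [\varphi_n(t) - \psi(t)]^2 \, dt = 0.$$

Therefore

$$\int_{-1}^{1} [f(t) - \varphi_n(t)]^2 \, dt$$

cannot converge to zero as $n \to \infty$.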
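The two integral estimates in this example can be checked numerically. The sketch below assumes that $\varphi_n$ is the piecewise linear function equal to −1 on [−1, −1/n], to nt on [−1/n, 1/n] and to 1 on [1/n, 1], and that ψ is the step function equal to −1 for t < 0 and 1 for t ≥ 0; the midpoint rule and the step count are arbitrary choices made for the illustration.

```python
def phi(n, t):
    """Piecewise linear ramp: -1 for t <= -1/n, n*t in between, 1 for t >= 1/n."""
    return max(-1.0, min(1.0, n * t))


def psi(t):
    """Discontinuous step function: -1 for t < 0, 1 for t >= 0."""
    return -1.0 if t < 0 else 1.0


def sq_dist(f, g, a=-1.0, b=1.0, steps=20_000):
    """Midpoint-rule approximation of the integral of (f - g)**2 over [a, b]."""
    h = (b - a) / steps
    total = 0.0
    for i in range(steps):
        t = a + (i + 0.5) * h
        total += (f(t) - g(t)) ** 2
    return total * h


# The squared distance between two terms is bounded by 2 / min(n, n').
for n, m in [(5, 10), (10, 40), (50, 200)]:
    d = sq_dist(lambda t: phi(n, t), lambda t: phi(m, t))
    assert d <= 2 / min(n, m)

# The squared distance from phi_n to psi equals 2/(3n), which tends to zero.
for n in (10, 100):
    print(n, sq_dist(lambda t: phi(n, t), psi), 2 / (3 * n))
```

The value 2/(3n) comes from integrating $(1 - nt)^2$ over [0, 1/n] and doubling by symmetry.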


7.2. The nested sphere theorem. A sequence of closed spheres

$$S[x_1, r_1], \; S[x_2, r_2], \; \ldots, \; S[x_n, r_n], \; \ldots$$

in a metric space R is said to be nested (or decreasing) if

$$S[x_1, r_1] \supset S[x_2, r_2] \supset \cdots \supset S[x_n, r_n] \supset \cdots$$

Using this concept, we can prove a simple criterion for the completeness of R:


THEOREM 2 (Nested sphere theorem). A metric space R is complete if
and only if every nested sequence $\{S_n\} = \{S[x_n, r_n]\}$ of closed spheres in
R such that $r_n \to 0$ as $n \to \infty$ has a nonempty intersection

$$\bigcap_{n=1}^{\infty} S_n.$$

Proof. If R is complete and if $\{S_n\} = \{S[x_n, r_n]\}$ is any nested
sequence of closed spheres in R such that $r_n \to 0$ as $n \to \infty$, then the
sequence $\{x_n\}$ of centers of the spheres is fundamental, since
$\rho(x_n, x_{n'}) \le r_n$ for $n' > n$ and $r_n \to 0$ as $n \to \infty$. Therefore
$\{x_n\}$ has a limit. Let

$$x = \lim_{n \to \infty} x_n.$$

Then

$$x \in \bigcap_{n=1}^{\infty} S_n.$$

In fact, $S_n$ contains every point of the sequence $\{x_n\}$ except possibly the
points $x_1, x_2, \ldots, x_{n-1}$, and hence x is a contact point of every sphere
$S_n$. But $S_n$ is closed, and hence $x \in S_n$ for all n.

Conversely, suppose every nested sequence of closed spheres in R
with radii converging to zero has a nonempty intersection, and let $\{x_n\}$
be any fundamental sequence in R. Then $\{x_n\}$ has a limit in R. To see this,
use the fact that $\{x_n\}$ is fundamental to choose a term $x_{n_1}$ of the
sequence $\{x_n\}$ such that

$$\rho(x_n, x_{n_1}) < \frac{1}{2}$$

for all $n > n_1$, and let $S_1$ be the closed sphere of radius 1 with center
$x_{n_1}$. Then choose a term $x_{n_2}$ of $\{x_n\}$ such that $n_2 > n_1$ and

$$\rho(x_n, x_{n_2}) < \frac{1}{2^2}$$

for all $n > n_2$, and let $S_2$ be the closed sphere of radius 1/2 with center
$x_{n_2}$. Continue this construction indefinitely, i.e., once having chosen terms
$x_{n_1}, x_{n_2}, \ldots, x_{n_k}$ ($n_1 < n_2 < \cdots < n_k$), choose a term
$x_{n_{k+1}}$ such that $n_{k+1} > n_k$ and

$$\rho(x_n, x_{n_{k+1}}) < \frac{1}{2^{k+1}}$$

for all $n > n_{k+1}$, let $S_{k+1}$ be the closed sphere of radius $1/2^k$ with
center $x_{n_{k+1}}$, and so on. This gives a nested sequence $\{S_k\}$ of closed
spheres with radii converging to zero. By hypothesis, these spheres have a
nonempty intersection, i.e., there is a point x in all the spheres. This point
is obviously the limit of the sequence $\{x_{n_k}\}$. But if a fundamental
sequence contains a subsequence converging to x, then the sequence itself
must converge to x (why?), i.e.,

$$\lim_{n \to \infty} x_n = x.$$
∎

7.3. Baire’s theorem. It will be recalled from Sec. 6.3 that a subset A of a
metric space R is said to be nowhere dense in R if it is dense in no (open)
sphere at all, or equivalently, if every sphere $S \subset R$ contains another
sphere S′ such that $S' \cap A = \emptyset$ (check the equivalence). This concept
plays an important role in


THEOREM 3 (Baire). A complete metric space R cannot be represented 
as the union of a countable number of nowhere dense sets. 


Proof. Suppose to the contrary that

$$R = \bigcup_{n=1}^{\infty} A_n, \tag{6}$$

where every set $A_n$ is nowhere dense in R. Let $S_0 \subset R$ be a closed sphere
of radius 1. Since $A_1$ is nowhere dense in $S_0$, being nowhere dense in R,
there is a closed sphere $S_1$ of radius less than 1/2 such that
$S_1 \subset S_0$ and $S_1 \cap A_1 = \emptyset$. Since $A_2$ is nowhere dense in
$S_1$, being nowhere dense in $S_0$, there is a closed sphere $S_2$ of radius less
than 1/3 such that $S_2 \subset S_1$ and $S_2 \cap A_2 = \emptyset$, and so on. In
this way, we get a nested sequence of closed spheres $\{S_n\}$ with radii
converging to zero such that

$$S_n \cap A_n = \emptyset \qquad (n = 1, 2, \ldots).$$

By the nested sphere theorem, the intersection

$$\bigcap_{n=1}^{\infty} S_n$$

contains a point x. By construction, x cannot belong to any of the
sets $A_n$, i.e.,

$$x \notin \bigcup_{n=1}^{\infty} A_n.$$

It follows that

$$R \ne \bigcup_{n=1}^{\infty} A_n,$$

contrary to (6). Hence the representation (6) is impossible. ∎


COROLLARY. A complete metric space R without isolated points is 
uncountable. 


Proof. Every single-element set {x} is nowhere dense in R. ∎


7.4. Completion of a metric space. As we now show, an incomplete metric
space can always be enlarged (in an essentially unique way) to give a complete
metric space.


DEFINITION 4. Given a metric space R with closure [R], a complete
metric space R* is called a completion of R if $R \subset R^*$ and $[R] = R^*$,
i.e., if R is a subset of R* everywhere dense in R*.


Example 1. Clearly R* = R if R is already complete (see Problem 7).


Example 2. The space of all real numbers is the completion of the space 
of all rational numbers. 


THEOREM 4. Every metric space R has a completion. This completion 
is unique to within an isometric mapping carrying every point x € R into 
itself. 


Proof. The proof is somewhat lengthy, but completely straightforward.
First we prove the uniqueness, showing that if R* and R**
are two completions of R, then there is a one-to-one mapping
$x^{**} = \varphi(x^*)$ of R* onto R** such that $\varphi(x) = x$ for all
$x \in R$ and

$$\rho_1(x^*, y^*) = \rho_2(x^{**}, y^{**}) \tag{7}$$

($y^{**} = \varphi(y^*)$), where $\rho_1$ is the distance in R* and $\rho_2$ the
distance in R**. The required mapping φ is constructed as follows: Let x* be an
arbitrary point of R*. Then, by the definition of a completion, there is a
sequence $\{x_n\}$ of points of R converging to x*. The points of the sequence
$\{x_n\}$ also belong to R**, where they form a fundamental sequence (why?).
Therefore $\{x_n\}$ converges to a point $x^{**} \in R^{**}$, since R** is complete.
It is clear that x** is independent of the choice of the sequence $\{x_n\}$
converging to the point x* (why?). If we set $\varphi(x^*) = x^{**}$, then φ is
the required mapping. In fact, $\varphi(x) = x$ for all $x \in R$, since if
$x_n \to x \in R$, then obviously $x = x^* \in R^*$, $x^{**} = x$. Moreover,
suppose $x_n \to x^*$, $y_n \to y^*$ in R*, while $x_n \to x^{**}$,
$y_n \to y^{**}$ in R**. Then, if ρ is the distance in R,

$$\rho_1(x^*, y^*) = \lim_{n \to \infty} \rho_1(x_n, y_n) = \lim_{n \to \infty} \rho(x_n, y_n) \tag{8}$$

(see Problem 3, p. 54), while at the same time

$$\rho_2(x^{**}, y^{**}) = \lim_{n \to \infty} \rho_2(x_n, y_n) = \lim_{n \to \infty} \rho(x_n, y_n). \tag{8'}$$

But (8) and (8′) together imply (7).
We must now prove the existence of a completion of R. Given an
arbitrary metric space R, we say that two Cauchy sequences $\{x_n\}$ and
$\{\tilde{x}_n\}$ in R are equivalent and write $\{x_n\} \sim \{\tilde{x}_n\}$ if

$$\lim_{n \to \infty} \rho(x_n, \tilde{x}_n) = 0.$$

As anticipated by the notation and terminology, ~ is reflexive, symmetric
and transitive, i.e., ~ is an equivalence relation in the sense of
Sec. 1.4. Therefore the set of all Cauchy sequences of points in the space
R can be partitioned into classes of equivalent sequences. Let these
classes be the points of a new space R*. Then we define the distance
between two arbitrary points $x^*, y^* \in R^*$ by the formula

$$\rho_1(x^*, y^*) = \lim_{n \to \infty} \rho(x_n, y_n), \tag{9}$$

where $\{x_n\}$ is any “representative” of x* (namely, any Cauchy sequence
in the class x*) and $\{y_n\}$ is any representative of y*.

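The next step is to verify that (9) is in fact a distance, i.e., that (9)
exists, is independent of the choice of the sequences $\{x_n\} \in x^*$,
$\{y_n\} \in y^*$, and satisfies the three properties of a distance figuring in
Definition 1, p. 37. Given any $\varepsilon > 0$, it follows from the triangle
inequality in R (recall Problem 1b, p. 45) that

$$\begin{aligned}
|\rho(x_n, y_n) - \rho(x_{n'}, y_{n'})|
&= |\rho(x_n, y_n) - \rho(x_{n'}, y_n) + \rho(x_{n'}, y_n) - \rho(x_{n'}, y_{n'})| \\
&\le |\rho(x_n, y_n) - \rho(x_{n'}, y_n)| + |\rho(x_{n'}, y_n) - \rho(x_{n'}, y_{n'})| \\
&\le \rho(x_n, x_{n'}) + \rho(y_n, y_{n'}) < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon
\end{aligned} \tag{10}$$

for all sufficiently large n and n′. Therefore the sequence of real numbers
$\{s_n\} = \{\rho(x_n, y_n)\}$ is fundamental and hence has a limit. This limit is
independent of the choice $\{x_n\} \in x^*$, $\{y_n\} \in y^*$. In fact, suppose

$$\{x_n\}, \{\tilde{x}_n\} \in x^*, \qquad \{y_n\}, \{\tilde{y}_n\} \in y^*.$$

Then

$$|\rho(x_n, y_n) - \rho(\tilde{x}_n, \tilde{y}_n)| \le \rho(x_n, \tilde{x}_n) + \rho(y_n, \tilde{y}_n),$$

by a calculation analogous to (10). But

$$\lim_{n \to \infty} \rho(x_n, \tilde{x}_n) = \lim_{n \to \infty} \rho(y_n, \tilde{y}_n) = 0,$$

since $\{x_n\} \sim \{\tilde{x}_n\}$, $\{y_n\} \sim \{\tilde{y}_n\}$, and hence

$$\lim_{n \to \infty} \rho(x_n, y_n) = \lim_{n \to \infty} \rho(\tilde{x}_n, \tilde{y}_n).$$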

As for the three properties of a metric, it is obvious that
$\rho_1(x^*, y^*) = \rho_1(y^*, x^*)$, and the fact that $\rho_1(x^*, y^*) = 0$ if
and only if $x^* = y^*$ is an immediate consequence of the definition of
equivalent Cauchy sequences. To verify the triangle inequality in R*, we
start from the triangle inequality

$$\rho(x_n, z_n) \le \rho(x_n, y_n) + \rho(y_n, z_n)$$

in the original space R and then take the limit as $n \to \infty$, obtaining

$$\lim_{n \to \infty} \rho(x_n, z_n) \le \lim_{n \to \infty} \rho(x_n, y_n) + \lim_{n \to \infty} \rho(y_n, z_n),$$

i.e.,

$$\rho_1(x^*, z^*) \le \rho_1(x^*, y^*) + \rho_1(y^*, z^*).$$


We now come to the crucial step of showing that R* is a completion
of R. Suppose that with every point $x \in R$, we associate the class
$x^* \in R^*$ of all Cauchy sequences converging to x. Let

$$\{x_n\} \in x^*, \qquad \{y_n\} \in y^*,$$

i.e.,

$$x = \lim_{n \to \infty} x_n, \qquad y = \lim_{n \to \infty} y_n.$$

Then clearly

$$\rho(x, y) = \lim_{n \to \infty} \rho(x_n, y_n)$$

(recall Problem 3, p. 54), while on the other hand

$$\rho_1(x^*, y^*) = \lim_{n \to \infty} \rho(x_n, y_n),$$

by definition. Therefore

$$\rho(x, y) = \rho_1(x^*, y^*),$$

and hence the mapping of R into R* carrying x into x* is isometric.
Accordingly, we need no longer distinguish between the original space R
and its image in R*, in particular between the two metrics ρ and $\rho_1$
(recall the relevant comments on p. 44). In other words, R can be
regarded as a subset of R*. The theorem will be proved once we succeed
in showing that

1) R is everywhere dense in R*, i.e., $[R] = R^*$;
2) R* is complete.


To this end, given any point $x^* \in R^*$ and any $\varepsilon > 0$, choose a
representative of x*, namely a Cauchy sequence $\{x_n\}$ in the class x*. Let
N be such that $\rho(x_n, x_{n'}) < \varepsilon$ for all $n, n' > N$. Then

$$\rho_1(x_n, x^*) = \lim_{n' \to \infty} \rho(x_n, x_{n'}) \le \varepsilon$$

if n > N, i.e., every neighborhood of the point x* contains a point of R.
It follows that $[R] = R^*$.

Finally, to show that R* is complete, we first note that by the very
definition of R*, any Cauchy sequence $\{x_n\}$ consisting of points in R
converges to some point in R*, namely to the point $x^* \in R^*$ defined by
$\{x_n\}$. Moreover, since R is dense in R*, given any Cauchy sequence
$\{x_n^*\}$ consisting of points in R*, we can find an equivalent sequence
$\{x_n\}$ consisting of points in R. In fact, we need only choose $x_n$ to be any
point of R such that $\rho_1(x_n, x_n^*) < 1/n$. The resulting sequence $\{x_n\}$ is
fundamental, and, as just shown, converges to a point $x^* \in R^*$. But then
the sequence $\{x_n^*\}$ also converges to x*. ∎


Example. If R is the space of all rational numbers, then R* is the space of
all real numbers, both equipped with the distance $\rho(x, y) = |x - y|$. In this
way, we can “construct the real number system.” However, there still
remains the problem of suitably defining sums and products of real numbers
and verifying that the usual axioms of arithmetic are satisfied.
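To make the idea of a real number as a class of Cauchy sequences of rationals concrete, here is a small sketch that produces rational approximations to $\sqrt{2}$. The Newton iteration $x_{n+1} = (x_n + 2/x_n)/2$ is a choice made for this illustration and does not come from the text; any Cauchy sequence of rationals with no rational limit would do.

```python
from fractions import Fraction


def newton_sqrt2(steps):
    """Rational terms x_0 = 1, x_{n+1} = (x_n + 2/x_n) / 2."""
    x = Fraction(1)
    seq = [x]
    for _ in range(steps):
        x = (x + 2 / x) / 2
        seq.append(x)
    return seq


seq = newton_sqrt2(6)
gaps = [abs(seq[n + 1] - seq[n]) for n in range(len(seq) - 1)]

print(seq[:4])                        # the terms 1, 3/2, 17/12, 577/408
print([float(g) for g in gaps])       # consecutive distances shrink rapidly
print(all(x * x != 2 for x in seq))   # True: no term is a rational square root of 2
```

Every term is a rational number and the distances between consecutive terms shrink rapidly, yet no rational number has square 2, so the sequence determines a point of R* that does not belong to R.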


Problem 1. Prove that the limit f(t) of a uniformly convergent sequence
of functions $\{f_n(t)\}$ continuous on [a, b] is itself a function continuous on
[a, b].

Hint. Clearly

$$|f(t) - f(t_0)| \le |f(t) - f_n(t)| + |f_n(t) - f_n(t_0)| + |f_n(t_0) - f(t_0)|,$$

where $t, t_0 \in [a, b]$. Use the uniform convergence to make the sum of the
first and third terms on the right small for sufficiently large n. Then use the
continuity of $f_n(t)$ to make the second term small for t sufficiently close to
$t_0$.


Problem 2. Prove that the space m in Example 9, p. 41 is complete. 


Problem 3. Prove that if R is complete, then the intersection
$\bigcap_{n=1}^{\infty} S_n$ figuring in Theorem 2 consists of a single point.


Problem 4. By the diameter of a subset A of a metric space R is meant the
number

$$d(A) = \sup_{x, y \in A} \rho(x, y).$$

Suppose R is complete, and let $\{A_n\}$ be a sequence of closed subsets of R
nested in the sense that

$$A_1 \supset A_2 \supset \cdots \supset A_n \supset \cdots$$

Suppose further that

$$\lim_{n \to \infty} d(A_n) = 0.$$

Prove that the intersection $\bigcap_{n=1}^{\infty} A_n$ is nonempty.


Problem 5. A subset A of a metric space R is said to be bounded if its
diameter d(A) is finite. Prove that the union of a finite number of bounded
sets is bounded.


Problem 6. Give an example of a complete metric space R and a nested
sequence $\{A_n\}$ of closed subsets of R such that

$$\bigcap_{n=1}^{\infty} A_n = \emptyset.$$

Reconcile this example with Problem 4.


Problem 7. Prove that a subspace of a complete metric space R is com- 
plete if and only if it is closed. 


Problem 8. Prove that the real line equipped with the distance

$$\rho(x, y) = |\arctan x - \arctan y|$$

is an incomplete metric space.


Problem 9. Give an example of a complete metric space homeomorphic 
to an incomplete metric space. 


Hint. Consider the example on p. 44. 


Comment. Thus homeomorphic metric spaces can have different “‘metric 
properties.” 


Problem 10. Carry out the program discussed in the last sentence of the 
example on p. 65. 


Hint. If $\{x_n\}$ and $\{y_n\}$ are Cauchy sequences of rational numbers serving
as “representatives” of real numbers x* and y*, respectively, define x* + y*
as the real number with representative $\{x_n + y_n\}$.


8. Contraction Mappings 


8.1. Definition of a contraction mapping. The fixed point theorem. Let A
be a mapping of a metric space R into itself. Then x is called a fixed point
of A if Ax = x, i.e., if A carries x into itself. Suppose there exists a number
$\alpha < 1$ such that

$$\rho(Ax, Ay) \le \alpha \rho(x, y) \qquad (1)$$

for every pair of points $x, y \in R$. Then A is said to be a contraction mapping.
Every contraction mapping is automatically continuous, since it follows from
the “contraction condition” (1) that $Ax_n \to Ax$ whenever $x_n \to x$.


THEOREM 1 (Fixed point theorem$^{10}$). Every contraction mapping A
defined on a complete metric space R has a unique fixed point.


10 Often called the method of successive approximations (see the remark following 
Theorem 1) or the principle of contraction mappings. 

Proof. Given an arbitrary point $x_0 \in R$, let$^{11}$

$$x_1 = Ax_0, \quad x_2 = Ax_1 = A^2 x_0, \quad \ldots, \quad x_n = Ax_{n-1} = A^n x_0, \quad \ldots \qquad (2)$$


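Then the sequence $\{x_n\}$ is fundamental. In fact, assuming to be explicit
that $n \le n'$, we have

$$\begin{aligned}
\rho(x_n, x_{n'}) &= \rho(A^n x_0, A^{n'} x_0) \le \alpha^n \rho(x_0, x_{n'-n}) \\
&\le \alpha^n [\rho(x_0, x_1) + \rho(x_1, x_2) + \cdots + \rho(x_{n'-n-1}, x_{n'-n})] \\
&\le \alpha^n \rho(x_0, x_1) [1 + \alpha + \alpha^2 + \cdots + \alpha^{n'-n-1}] < \alpha^n \rho(x_0, x_1) \frac{1}{1 - \alpha}.
\end{aligned}$$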


But the expression on the right can be made arbitrarily small for sufficiently
large n, since $\alpha < 1$. Since R is complete, the sequence $\{x_n\}$,
being fundamental, has a limit

$$x = \lim_{n \to \infty} x_n.$$


Then, by the continuity of A,

$$Ax = A \lim_{n \to \infty} x_n = \lim_{n \to \infty} Ax_n = \lim_{n \to \infty} x_{n+1} = x.$$

This proves the existence of a fixed point x. To prove the uniqueness of x,
we note that if

$$Ax = x, \qquad Ay = y,$$

(1) becomes

$$\rho(x, y) \le \alpha \rho(x, y).$$

But then $\rho(x, y) = 0$ since $\alpha < 1$, and hence $x = y$. ∎


Remark. The fixed point theorem can be used to prove existence and 
uniqueness theorems for solutions of equations of various types. Besides 
showing that an equation of the form Ax = x has a unique solution, the 
fixed point theorem also gives a practical method for finding the solution, i.e., 
calculation of the “successive approximations” (2). In fact, as shown in
the proof, the approximations (2) actually converge to the solution of the 
equation Ax = x. For this reason, the fixed point theorem is often called 
the method of successive approximations. 
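
The remark above can be illustrated by a minimal sketch, assuming a mapping on the real line. The names `successive_approximations` and `a_priori_bound`, the sample mapping `A(x) = x/2 + 1` (a contraction with constant 1/2 and fixed point 2) and the stopping tolerance are illustrative assumptions, not part of the text. The bound used is the one obtained in the proof by letting n' tend to infinity: the distance from the n-th approximation to the fixed point is at most $\alpha^n \rho(x_0, x_1)/(1 - \alpha)$.

```python
def successive_approximations(A, x0, dist, tol=1e-12, max_iter=1000):
    """Iterate x_{n+1} = A(x_n) until two consecutive iterates are within tol."""
    x = x0
    for n in range(max_iter):
        x_next = A(x)
        if dist(x, x_next) <= tol:
            return x_next, n + 1
        x = x_next
    raise RuntimeError("no convergence within max_iter steps")


def a_priori_bound(alpha, n, d01):
    """Bound alpha**n / (1 - alpha) * rho(x0, x1) on rho(x_n, x), taken from the proof."""
    return alpha ** n / (1.0 - alpha) * d01


# Hypothetical example: A(x) = x/2 + 1 on the real line, alpha = 1/2.
A = lambda x: 0.5 * x + 1.0
dist = lambda x, y: abs(x - y)

x_star, steps = successive_approximations(A, 0.0, dist)

# Compare the true error of x_5 with the a priori bound.
x5 = 0.0
for _ in range(5):
    x5 = A(x5)
bound5 = a_priori_bound(0.5, 5, dist(0.0, A(0.0)))
```

The a priori bound makes it possible to decide in advance how many steps are enough for a given accuracy, using only the first step and the constant $\alpha$.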


Example 1. Let f be a function defined on the closed interval [a, b] which
maps [a, b] into itself and satisfies a Lipschitz condition

$$|f(x_1) - f(x_2)| \le K |x_1 - x_2|, \qquad (3)$$

with constant K < 1. Then f is a contraction mapping, and hence, by


11 $A^2 x$ means $A(Ax)$, $A^3 x$ means $A(A^2 x) = A^2(Ax)$, and so on.

FIGURE 10


Theorem 1, the sequence

$$x_0, \quad x_1 = f(x_0), \quad x_2 = f(x_1), \quad \ldots \qquad (4)$$

converges to the unique root of the equation f(x) = x. In particular, the
“contraction condition” (3) holds if f has a continuous derivative $f'$ on [a, b]
such that

$$|f'(x)| \le K < 1.$$

The behavior of the successive approximations (4) in the cases $0 < f'(x) < 1$
and $-1 < f'(x) < 0$ is shown in Figures 10 and 11.
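
As a small numerical illustration of Example 1, the sketch below uses a function chosen here purely for illustration: $f(x) = (x + 2/x)/2$ on [1, 2]. It maps [1, 2] into itself (its values lie between $\sqrt{2}$ and 1.5) and satisfies $|f'(x)| \le 1/2$ there, so the sequence (4) should converge to the root of f(x) = x, namely $\sqrt{2}$. The helper name `iterate` is likewise an assumption.

```python
import math


def f(x):
    """Hypothetical contraction on [1, 2]: f(x) = (x + 2/x) / 2, with |f'(x)| <= 1/2."""
    return 0.5 * (x + 2.0 / x)


def iterate(f, x0, n_steps):
    """Return the list [x_0, x_1, ..., x_n] of successive approximations (4)."""
    xs = [x0]
    for _ in range(n_steps):
        xs.append(f(xs[-1]))
    return xs


xs = iterate(f, 1.0, 8)
root = xs[-1]
```

Every iterate stays in [1, 2], as the requirement that f map the interval into itself demands.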


Example 2. Consider the mapping A of n-dimensional space into itself
given by the system of linear equations

$$y_i = \sum_{j=1}^{n} a_{ij} x_j + b_i \qquad (i = 1, \ldots, n). \qquad (5)$$

FIGURE 11

If A is a contraction mapping, we can use the method of successive approximations
to solve the equation Ax = x. The conditions under which A is a
contraction mapping depend on the choice of metric. We now examine three
cases:


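1) The space $R_0^n$ with metric

$$\rho(x, y) = \max_{1 \le i \le n} |x_i - y_i|.$$

In this case, writing $y = Ax$ and $\tilde{y} = A\tilde{x}$,

$$\begin{aligned}
\rho(y, \tilde{y}) = \max_i |y_i - \tilde{y}_i| &= \max_i \Big| \sum_j a_{ij} (x_j - \tilde{x}_j) \Big| \\
&\le \max_i \sum_j |a_{ij}| \, |x_j - \tilde{x}_j| \\
&\le \max_i \sum_j |a_{ij}| \max_j |x_j - \tilde{x}_j| = \Big( \max_i \sum_j |a_{ij}| \Big) \rho(x, \tilde{x}),
\end{aligned}$$

and the contraction condition is

$$\sum_{j=1}^{n} |a_{ij}| \le \alpha < 1 \qquad (i = 1, \ldots, n). \qquad (6)$$

2) The space $R_1^n$ with metric

$$\rho(x, y) = \sum_{i=1}^{n} |x_i - y_i|.$$

Here

$$\begin{aligned}
\rho(y, \tilde{y}) = \sum_i |y_i - \tilde{y}_i| &= \sum_i \Big| \sum_j a_{ij} (x_j - \tilde{x}_j) \Big| \\
&\le \sum_i \sum_j |a_{ij}| \, |x_j - \tilde{x}_j| \le \Big( \max_j \sum_i |a_{ij}| \Big) \rho(x, \tilde{x}),
\end{aligned}$$

and the contraction condition is now

$$\sum_{i=1}^{n} |a_{ij}| \le \alpha < 1 \qquad (j = 1, \ldots, n). \qquad (7)$$

3) Ordinary Euclidean space $R^n$ with metric

$$\rho(x, y) = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2}.$$

Using the Cauchy-Schwarz inequality, we have

$$\rho^2(y, \tilde{y}) = \sum_i \Big( \sum_j a_{ij} (x_j - \tilde{x}_j) \Big)^2 \le \Big( \sum_i \sum_j a_{ij}^2 \Big) \rho^2(x, \tilde{x}),$$

and the contraction condition becomes

$$\sum_{i=1}^{n} \sum_{j=1}^{n} a_{ij}^2 \le \alpha < 1. \qquad (8)$$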

Thus, if at least one of the conditions (6)-(8) holds, there exists a unique
point $x = (x_1, x_2, \ldots, x_n)$ such that

$$x_i = \sum_{j=1}^{n} a_{ij} x_j + b_i \qquad (i = 1, \ldots, n). \qquad (9)$$

The successive approximations to this solution of the equation
x = Ax are of the form

$$\begin{aligned}
x^{(0)} &= (x_1^{(0)}, x_2^{(0)}, \ldots, x_n^{(0)}), \\
x^{(1)} &= (x_1^{(1)}, x_2^{(1)}, \ldots, x_n^{(1)}), \\
&\;\;\vdots \\
x^{(k)} &= (x_1^{(k)}, x_2^{(k)}, \ldots, x_n^{(k)}), \\
&\;\;\vdots
\end{aligned}$$

where

$$x_i^{(k)} = \sum_{j=1}^{n} a_{ij} x_j^{(k-1)} + b_i,$$

and we can choose any point $x^{(0)}$ as the “zeroth approximation.”

Each of the conditions (6)-(8) is sufficient for applicability of the method 
of successive approximations, but none of them is necessary. In fact, examples 
can be constructed in which each of the conditions (6)—(8) is satisfied, but 
not the other two. 
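
The three sufficient conditions and the iteration for the system (9) can be sketched as follows. The function names, the sample matrices `a` and `only_rows`, and the vector `b` are illustrative assumptions. The first matrix satisfies all of (6)-(8); the second satisfies (6) but neither (7) nor (8), which is one instance of the independence just mentioned.

```python
def row_sum_bound(a):
    """Left side of (6): max over rows i of sum_j |a_ij|."""
    return max(sum(abs(v) for v in row) for row in a)


def column_sum_bound(a):
    """Left side of (7): max over columns j of sum_i |a_ij|."""
    n = len(a)
    return max(sum(abs(a[i][j]) for i in range(n)) for j in range(n))


def square_sum_bound(a):
    """Left side of (8): sum over all i, j of a_ij squared."""
    return sum(v * v for row in a for v in row)


def solve_by_iteration(a, b, n_steps=200):
    """Successive approximations x^(k) = a x^(k-1) + b, starting from x^(0) = 0."""
    x = [0.0] * len(b)
    for _ in range(n_steps):
        x = [sum(aij * xj for aij, xj in zip(row, x)) + bi
             for row, bi in zip(a, b)]
    return x


# Hypothetical data: every condition (6)-(8) holds for this matrix.
a = [[0.2, 0.1], [0.3, 0.4]]
b = [1.0, 2.0]
x = solve_by_iteration(a, b)

# Hypothetical matrix satisfying (6) only.
only_rows = [[0.9, 0.0], [0.9, 0.0]]
```

For the first matrix the exact solution of (9) is x = (16/9, 38/9), which the iteration reproduces to rounding error.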


Theorem 1 has the following useful generalization, which will be needed
later (see Example 2, p. 75):


THEOREM 1′. Given a continuous mapping A of a complete metric space R
into itself, suppose $A^n$ is a contraction mapping (n an integer > 1). Then
A has a unique fixed point.

Proof. Choosing any point $x_0 \in R$, let

$$x = \lim_{k \to \infty} A^{kn} x_0$$

(the limit exists by the proof of Theorem 1, applied to the contraction
mapping $A^n$). Then, by the continuity of A,

$$Ax = \lim_{k \to \infty} A A^{kn} x_0.$$

But $A^n$ is a contraction mapping, and hence

$$\rho(A^{kn} A x_0, A^{kn} x_0) \le \alpha \rho(A^{(k-1)n} A x_0, A^{(k-1)n} x_0) \le \cdots \le \alpha^k \rho(Ax_0, x_0),$$

where $\alpha < 1$. It follows that

$$\rho(Ax, x) = \lim_{k \to \infty} \rho(A A^{kn} x_0, A^{kn} x_0) = 0,$$

i.e., Ax = x so that x is a fixed point of A. To prove the uniqueness of x,
we merely note that if A has more than one fixed point, then so does $A^n$,
which is impossible, by Theorem 1, since $A^n$ is a contraction
mapping. ∎
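
A simple case where Theorem 1′ applies but Theorem 1 does not, offered here as an illustration rather than as an example from the text, is the mapping A = cos on the whole real line. Its difference quotients come arbitrarily close to 1 near $\pi/2$, so no constant $\alpha < 1$ works for A itself. On the other hand, $A^2 x = \cos(\cos x)$ has derivative $\sin(\cos x) \sin x$, which is bounded in absolute value by $\sin 1 < 1$, so $A^2$ is a contraction. The sketch below computes $x = \lim A^{2k} x_0$ and checks that it is a fixed point of A; the helper name `iterate` and the starting point are assumptions.

```python
import math


def iterate(A, x0, n_steps):
    """Apply the mapping A to x0 repeatedly, n_steps times."""
    x = x0
    for _ in range(n_steps):
        x = A(x)
    return x


A = math.cos
A2 = lambda x: math.cos(math.cos(x))   # A^2, a contraction with constant sin(1)

x_star = iterate(A2, 10.0, 200)        # approximates lim A^{2k} x_0

# Difference quotient of A near pi/2, showing that A itself is not a contraction.
eps = 1e-4
quotient = abs(A(math.pi / 2 + eps) - A(math.pi / 2 - eps)) / (2 * eps)
```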


8.2. Contraction mappings and differential equations. The most interesting 
applications of Theorems 1 and 1’ arise when the space R is a function 
space. We can then use these theorems to prove a number of existence and 
uniqueness theorems for differential and integral equations, as shown in this 
section and the next. 


THEOREM 2 (Picard). Given a function f(x, y) defined and continuous
on a plane domain G containing the point $(x_0, y_0)$,$^{12}$ suppose f satisfies a
Lipschitz condition of the form

$$|f(x, y) - f(x, \tilde{y})| \le M |y - \tilde{y}|$$

in the variable y. Then there is an interval $|x - x_0| \le \delta$ in which the
differential equation

$$\frac{dy}{dx} = f(x, y) \qquad (10)$$

has a unique solution

$$y = \varphi(x)$$

satisfying the initial condition

$$\varphi(x_0) = y_0. \qquad (11)$$


Proof. Together the differential equation (10) and the initial condition
(11) are equivalent to the integral equation

$$\varphi(x) = y_0 + \int_{x_0}^{x} f(t, \varphi(t))\,dt. \qquad (12)$$

By the continuity of f, we have

$$|f(x, y)| \le K \qquad (13)$$

in some domain $G' \subset G$ containing the point $(x_0, y_0)$.$^{13}$ Choose $\delta > 0$
such that

1) $(x, y) \in G'$ if $|x - x_0| \le \delta$, $|y - y_0| \le K\delta$;
2) $M\delta < 1$,

and let C* be the space of continuous functions $\varphi$ defined on the interval
12 By an n-dimensional domain we mean an open connected set in Euclidean n-space
$R^n$ (connectedness is defined in Problem 12, p. 55).
13 In fact, f is bounded on $[G']$ if $[G'] \subset G$ (cf. Theorem 2, p. 110).
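$|x - x_0| \le \delta$ and such that $|\varphi(x) - y_0| \le K\delta$, equipped with the metric

$$\rho(\varphi, \tilde{\varphi}) = \max_x |\varphi(x) - \tilde{\varphi}(x)|.$$

The space C* is complete, since it is a closed subspace of the space of all
continuous functions on $[x_0 - \delta, x_0 + \delta]$. Consider the mapping $\psi = A\varphi$
defined by the integral equation

$$\psi(x) = y_0 + \int_{x_0}^{x} f(t, \varphi(t))\,dt \qquad (|x - x_0| \le \delta).$$

Clearly A is a contraction mapping carrying C* into itself. In fact, if
$\varphi \in C^*$, $|x - x_0| \le \delta$, then

$$|\psi(x) - y_0| = \left| \int_{x_0}^{x} f(t, \varphi(t))\,dt \right| \le \left| \int_{x_0}^{x} |f(t, \varphi(t))|\,dt \right| \le K |x - x_0| \le K\delta$$

by (13), and hence $\psi = A\varphi$ also belongs to C*. Moreover, if $\tilde{\psi} = A\tilde{\varphi}$, then

$$|\psi(x) - \tilde{\psi}(x)| \le \left| \int_{x_0}^{x} |f(t, \varphi(t)) - f(t, \tilde{\varphi}(t))|\,dt \right| \le M\delta \max_x |\varphi(x) - \tilde{\varphi}(x)|,$$

and hence

$$\rho(\psi, \tilde{\psi}) \le M\delta \rho(\varphi, \tilde{\varphi})$$

after maximizing with respect to x. But $M\delta < 1$, so that A is a contraction
mapping. It follows from Theorem 1 that the equation $\varphi = A\varphi$,
i.e., the integral equation (12), has a unique solution in the space C*. ∎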
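
The mapping A used in the proof can be applied numerically. The following is a minimal sketch under stated assumptions: the integral is replaced by a cumulative trapezoid rule on a uniform grid, only the right half $[x_0, x_0 + \delta]$ of the interval is treated, and the test problem y' = y, y(0) = 1 (with M = 1, $\delta$ = 0.5 and exact solution $e^x$) is chosen for illustration. The name `picard_iterates` and the grid parameters are hypothetical.

```python
import math


def picard_iterates(f, x0, y0, delta, n_grid=201, n_iter=25):
    """Apply (A phi)(x) = y0 + integral from x0 to x of f(t, phi(t)) dt repeatedly on a grid."""
    h = delta / (n_grid - 1)
    xs = [x0 + i * h for i in range(n_grid)]   # grid on [x0, x0 + delta]
    phi = [y0] * n_grid                        # zeroth approximation: the constant y0
    for _ in range(n_iter):
        values = [f(t, p) for t, p in zip(xs, phi)]
        new_phi = [y0]
        for i in range(1, n_grid):
            # cumulative trapezoid rule for the integral from x0 to xs[i]
            new_phi.append(new_phi[-1] + 0.5 * h * (values[i - 1] + values[i]))
        phi = new_phi
    return xs, phi


# Hypothetical test problem: y' = y, y(0) = 1, so that M * delta = 0.5 < 1.
xs, phi = picard_iterates(lambda x, y: y, 0.0, 1.0, 0.5)
error = max(abs(p - math.exp(x)) for x, p in zip(xs, phi))
```

The remaining error comes almost entirely from the quadrature rule, since the successive approximations themselves converge very quickly when $M\delta < 1$.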


Theorem 2 can easily be generalized to the case of systems of differential 
equations: 


THEOREM 2′. Given n functions $f_i(x, y_1, \ldots, y_n)$ defined and continuous
on an (n + 1)-dimensional domain G containing the point

$$(x_0, y_{01}, \ldots, y_{0n}),$$

suppose each $f_i$ satisfies a Lipschitz condition of the form

$$|f_i(x, y_1, \ldots, y_n) - f_i(x, \tilde{y}_1, \ldots, \tilde{y}_n)| \le M \max_{1 \le i \le n} |y_i - \tilde{y}_i|$$

in the variables $y_1, \ldots, y_n$. Then there is an interval $|x - x_0| \le \delta$ in which
the system of differential equations

$$\frac{dy_i}{dx} = f_i(x, y_1, \ldots, y_n) \qquad (i = 1, \ldots, n) \qquad (14)$$

has a unique solution

$$y_1 = \varphi_1(x), \; \ldots, \; y_n = \varphi_n(x)$$

satisfying the initial conditions

$$\varphi_1(x_0) = y_{01}, \; \ldots, \; \varphi_n(x_0) = y_{0n}. \qquad (15)$$


Proof. Together the differential equations (14) and the initial conditions
(15) are equivalent to the system of integral equations

$$\varphi_i(x) = y_{0i} + \int_{x_0}^{x} f_i(t, \varphi_1(t), \ldots, \varphi_n(t))\,dt \qquad (i = 1, \ldots, n). \qquad (16)$$

By the continuity of the functions $f_i$, we have

$$|f_i(x, y_1, \ldots, y_n)| \le K \qquad (i = 1, \ldots, n) \qquad (17)$$

in some domain $G' \subset G$ containing the point $(x_0, y_{01}, \ldots, y_{0n})$. Choose
$\delta > 0$ such that

1) $(x, y_1, \ldots, y_n) \in G'$ if $|x - x_0| \le \delta$, $|y_i - y_{0i}| \le K\delta$ for all $i = 1, \ldots, n$;
2) $M\delta < 1$.

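This time let C* be the space of ordered n-tuples

$$\varphi = (\varphi_1, \ldots, \varphi_n)$$

of continuous functions $\varphi_1, \ldots, \varphi_n$ defined on the interval $|x - x_0| \le \delta$
such that $|\varphi_i(x) - y_{0i}| \le K\delta$ for all $i = 1, \ldots, n$, equipped with the
metric

$$\rho(\varphi, \tilde{\varphi}) = \max_{x,\,i} |\varphi_i(x) - \tilde{\varphi}_i(x)|.$$

Clearly C* is complete. Moreover, the mapping $\psi = A\varphi$ defined by the
system of integral equations

$$\psi_i(x) = y_{0i} + \int_{x_0}^{x} f_i(t, \varphi_1(t), \ldots, \varphi_n(t))\,dt \qquad (|x - x_0| \le \delta, \; i = 1, \ldots, n)$$

is a contraction mapping carrying C* into itself. In fact, if

$$\varphi = (\varphi_1, \ldots, \varphi_n) \in C^*, \qquad |x - x_0| \le \delta,$$

then

$$|\psi_i(x) - y_{0i}| = \left| \int_{x_0}^{x} f_i(t, \varphi_1(t), \ldots, \varphi_n(t))\,dt \right| \le K\delta \qquad (i = 1, \ldots, n)$$

by (17), so that $\psi = (\psi_1, \ldots, \psi_n) = A\varphi$ also belongs to C*. Moreover, if $\tilde{\psi} = A\tilde{\varphi}$, then

$$|\psi_i(x) - \tilde{\psi}_i(x)| \le \left| \int_{x_0}^{x} |f_i(t, \varphi_1(t), \ldots, \varphi_n(t)) - f_i(t, \tilde{\varphi}_1(t), \ldots, \tilde{\varphi}_n(t))|\,dt \right| \le M\delta \max_{x,\,i} |\varphi_i(x) - \tilde{\varphi}_i(x)|,$$

and hence

$$\rho(\psi, \tilde{\psi}) \le M\delta \rho(\varphi, \tilde{\varphi})$$

after maximizing with respect to x and i. But $M\delta < 1$, so that A is a
contraction mapping. It follows from Theorem 1 that the equation
$\varphi = A\varphi$, i.e., the system of integral equations (16), has a unique solution
in the space C*. ∎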


8.3. Contraction mappings and integral equations. We now show how the 
method of successive approximations can be used to prove the existence and 
uniqueness of solutions of integral equations. 


Example 1. By a Fredholm equation (of the second kind) is meant an
integral equation of the form

$$f(x) = \lambda \int_a^b K(x, y) f(y)\,dy + \varphi(x), \qquad (18)$$

involving two given functions K and $\varphi$, an unknown function f and an
arbitrary parameter $\lambda$. The function K is called the kernel of the equation,
and the equation is said to be homogeneous if $\varphi \equiv 0$ (but otherwise
nonhomogeneous).

Suppose K(x, y) and $\varphi(x)$ are continuous on the square $a \le x \le b$,
$a \le y \le b$, so that in particular

$$|K(x, y)| \le M \qquad (a \le x \le b, \; a \le y \le b).$$


Consider the mapping g = Af of the complete metric space $C_{[a,b]}$ into itself
given by

$$g(x) = \lambda \int_a^b K(x, y) f(y)\,dy + \varphi(x).$$

Clearly, if $g_1 = Af_1$, $g_2 = Af_2$, then

$$\begin{aligned}
\rho(g_1, g_2) = \max_x |g_1(x) - g_2(x)| &\le |\lambda| M (b - a) \max_x |f_1(x) - f_2(x)| \\
&= |\lambda| M (b - a) \rho(f_1, f_2),
\end{aligned}$$

so that A is a contraction mapping if

$$|\lambda| < \frac{1}{M(b - a)}. \qquad (19)$$

It follows from Theorem 1 that the integral equation (18) has a unique
solution for any value of $\lambda$ satisfying (19). The successive approximations
$f_0, f_1, \ldots, f_n, \ldots$ to this solution are given by

$$f_n(x) = \lambda \int_a^b K(x, y) f_{n-1}(y)\,dy + \varphi(x) \qquad (n = 1, 2, \ldots),$$

where any function continuous on [a, b] can be chosen as $f_0$. Note that the
method of successive approximations can be applied to the equation (18)
only for sufficiently small $|\lambda|$.
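
The successive approximations for the Fredholm equation (18) can be sketched on a grid, with the integral replaced by the trapezoid rule. Everything specific below is an assumption made for illustration: the kernel K(x, y) = xy on [0, 1], the function $\varphi(x) = x$, the value $\lambda = 1/2$ (which satisfies (19), since M = 1 and b - a = 1), and the names `trapezoid` and `fredholm_iterates`. For this data the exact solution is f(x) = 1.2 x.

```python
def trapezoid(values, h):
    """Trapezoid rule for equally spaced samples with spacing h."""
    return h * (sum(values) - 0.5 * (values[0] + values[-1]))


def fredholm_iterates(kernel, phi, lam, a, b, n_grid=101, n_iter=40):
    """Iterate f_n(x) = lam * integral_a^b K(x, y) f_{n-1}(y) dy + phi(x) on a grid."""
    h = (b - a) / (n_grid - 1)
    xs = [a + i * h for i in range(n_grid)]
    f = [0.0] * n_grid                      # zeroth approximation f_0 = 0
    for _ in range(n_iter):
        f = [lam * trapezoid([kernel(x, y) * fy for y, fy in zip(xs, f)], h) + phi(x)
             for x in xs]
    return xs, f


# Hypothetical data: K(x, y) = x*y, phi(x) = x, lam = 0.5 on [0, 1].
xs, f = fredholm_iterates(lambda x, y: x * y, lambda x: x, 0.5, 0.0, 1.0)
error = max(abs(fx - 1.2 * x) for x, fx in zip(xs, f))
```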


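Example 2. Next consider the Volterra equation

$$f(x) = \lambda \int_a^x K(x, y) f(y)\,dy + \varphi(x), \qquad (20)$$

which differs from the Fredholm equation (18) by having the variable x
rather than the fixed number b as the upper limit of integration.$^{14}$ It is easy
to see that the method of successive approximations can be applied to the
Volterra equation (20) for arbitrary $\lambda$, not just for sufficiently small $|\lambda|$ as
in the case of the Fredholm equation (18). In fact, let A be the mapping
of $C_{[a,b]}$ into itself defined by

$$Af(x) = \lambda \int_a^x K(x, y) f(y)\,dy + \varphi(x),$$

and let $f_1, f_2 \in C_{[a,b]}$. Then

$$\begin{aligned}
|Af_1(x) - Af_2(x)| &= \left| \lambda \int_a^x K(x, y) [f_1(y) - f_2(y)]\,dy \right| \\
&\le |\lambda| M (x - a) \max_x |f_1(x) - f_2(x)|,
\end{aligned}$$

where

$$M = \max_{x, y} |K(x, y)|.$$

It follows that

$$\begin{aligned}
|A^2 f_1(x) - A^2 f_2(x)| &\le |\lambda|^2 M^2 \max_x |f_1(x) - f_2(x)| \int_a^x (t - a)\,dt \\
&= |\lambda|^2 M^2 \frac{(x - a)^2}{2} \max_x |f_1(x) - f_2(x)|,
\end{aligned}$$

and in general,

$$\begin{aligned}
|A^n f_1(x) - A^n f_2(x)| &\le |\lambda|^n M^n \frac{(x - a)^n}{n!} \max_x |f_1(x) - f_2(x)| \\
&\le |\lambda|^n M^n \frac{(b - a)^n}{n!} \max_x |f_1(x) - f_2(x)|,
\end{aligned}$$

which implies

$$\rho(A^n f_1, A^n f_2) \le |\lambda|^n M^n \frac{(b - a)^n}{n!} \rho(f_1, f_2).$$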


14 Equation (20) can be regarded formally as a special case of (18) by extending the
definition of the kernel, i.e., by setting

$$K(x, y) = 0 \quad \text{if} \quad y > x.$$

But, given any $\lambda$, we can always choose n large enough to make

$$|\lambda|^n M^n \frac{(b - a)^n}{n!} < 1,$$

i.e., $A^n$ is a contraction mapping for sufficiently large n. It follows from
Theorem 1′ that the integral equation (20) has a unique solution for arbitrary $\lambda$.
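
The contrast with the Fredholm case can be seen numerically. In the sketch below (an illustration under stated assumptions, not a construction from the text) the kernel is K = 1, $\varphi = 1$, the interval is [0, 1] and $\lambda = 3$, so that $|\lambda| M (b - a) = 3$ and condition (19) fails, yet the successive approximations for the Volterra equation still converge, here to $e^{3x}$. The helper `smallest_contracting_power` finds the least n for which the factor $|\lambda|^n M^n (b - a)^n / n!$ drops below 1; both function names and the grid parameters are hypothetical, and the integral is approximated by the trapezoid rule.

```python
import math


def smallest_contracting_power(lam, M, a, b):
    """Smallest n with |lam|^n * M^n * (b - a)^n / n! < 1."""
    n, term = 1, abs(lam) * M * (b - a)
    while term >= 1.0:
        n += 1
        term *= abs(lam) * M * (b - a) / n
    return n


def volterra_iterates(kernel, phi, lam, a, b, n_grid=201, n_iter=40):
    """Iterate f_n(x) = lam * integral_a^x K(x, y) f_{n-1}(y) dy + phi(x) on a grid."""
    h = (b - a) / (n_grid - 1)
    xs = [a + i * h for i in range(n_grid)]
    f = [phi(x) for x in xs]                # zeroth approximation f_0 = phi
    for _ in range(n_iter):
        new_f = []
        for i, x in enumerate(xs):
            vals = [kernel(x, xs[j]) * f[j] for j in range(i + 1)]
            # trapezoid rule on [a, x]; it gives 0 when x = a
            integral = h * (sum(vals) - 0.5 * (vals[0] + vals[-1]))
            new_f.append(lam * integral + phi(x))
        f = new_f
    return xs, f


# Hypothetical data: K = 1, phi = 1, lam = 3 on [0, 1]; exact solution exp(3x).
n_power = smallest_contracting_power(3.0, 1.0, 0.0, 1.0)
xs, f = volterra_iterates(lambda x, y: 1.0, lambda x: 1.0, 3.0, 0.0, 1.0)
```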


Problem 1. Let A be a mapping of a metric space R into itself. Prove that 
the condition 


$$\rho(Ax, Ay) < \rho(x, y) \qquad (x \ne y)$$
is insufficient for the existence of a fixed point of A. 


Problem 2. Let F(x) be a continuously differentiable function defined on
the interval [a, b] such that F(a) < 0, F(b) > 0 and

$$0 < K_1 \le F'(x) \le K_2 \qquad (a \le x \le b).$$

Use Theorem 1 to find the unique root of the equation F(x) = 0.


Hint. Introduce the auxiliary function $f(x) = x - \lambda F(x)$, and choose $\lambda$
such that the theorem works for the equivalent equation f(x) = x.
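
A numerical sketch of the hint, under assumptions made here for illustration only: the function $F(x) = x^3 + x - 1$ on [0, 1], for which $1 \le F'(x) \le 4$, and the choice $\lambda = 1/K_2 = 1/4$, which is one possible choice and not necessarily the intended one. With this $\lambda$ the derivative of $f(x) = x - \lambda F(x)$ lies between 0 and $1 - K_1/K_2 = 3/4$ on [0, 1].

```python
def F(x):
    """Hypothetical test function with F(0) < 0 < F(1) and 1 <= F'(x) <= 4 on [0, 1]."""
    return x ** 3 + x - 1.0


K2 = 4.0
lam = 1.0 / K2            # assumed choice of the parameter in the hint


def f(x):
    """Auxiliary function f(x) = x - lam * F(x); its fixed points are the roots of F."""
    return x - lam * F(x)


root = 0.0                # any starting point in [0, 1]
for _ in range(100):
    root = f(root)
```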


Problem 3. Devise a proof of the implicit function theorem based on the 
use of the fixed point theorem.$^{15}$


Problem 4. Prove that the method of successive approximations can be
used to solve the system (9) if $|a_{ij}| < 1/n$ (for all i and j), but not if $|a_{ij}| = 1/n$.


Problem 5. Prove that the condition (6) is necessary for the mapping (5)
to be a contraction mapping in the space $R_0^n$.


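Problem 6. Prove that any of the conditions (6)-(8) implies

$$\begin{vmatrix}
a_{11} - 1 & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} - 1 & \cdots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
a_{n1} & a_{n2} & \cdots & a_{nn} - 1
\end{vmatrix} \ne 0.$$


Comment. Hence the fact that the system (5) has a unique solution (under
suitable conditions) follows from Cramer’s rule as well as from the fixed
point theorem.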


15 See e.g., I. G. Petrovski, Ordinary Differential Equations (translated by R. A. Silver- 
man), Prentice-Hall, Inc., Englewood Cliffs, N.J. (1966), p. 47. 

Problem 7. Consider the nonlinear integral equation

$$f(x) = \lambda \int_a^b K(x, y; f(y))\,dy + \varphi(x) \qquad (21)$$

with continuous K and $\varphi$, where K satisfies a Lipschitz condition of the form

$$|K(x, y; z_1) - K(x, y; z_2)| \le M |z_1 - z_2|$$

in its “functional” argument. Prove that (21) has a unique solution for all

$$|\lambda| < \frac{1}{M(b - a)}.$$

Write the successive approximations to this solution.


3 


TOPOLOGICAL SPACES 


9. Basic Concepts 


9.1. Definitions and examples. In our study of metric spaces, we defined 
a number of key ideas like contact point, limit point, closure of a set, etc. 
In each case, the definition rests on the notion of a neighborhood, or, what 
amounts to the same thing, the notion of an open set. These notions (neigh- 
borhood and open set) were in turn defined by using the metric (or distance) 
in the given space. However, instead of introducing a metric in a given set 
X, we can go about things differently, by specifying a system of open sets 
in X with suitable properties. This approach leads to the notion of a topo- 
logical space. Metric spaces are topological spaces of a rather special 
(although very important) kind. 


DEFINITION 1. Given a set X, by a topology in X is meant a system $\tau$ of
subsets $G \subset X$, called open sets (relative to $\tau$), with the following two
properties:

1) The set X itself and the empty set $\emptyset$ belong to $\tau$;

2) Arbitrary (finite or infinite) unions $\bigcup_\alpha G_\alpha$ and finite intersections
$\bigcap_{k=1}^{n} G_k$ of open sets belong to $\tau$.


DEFINITION 2. By a topological space is meant a pair $(X, \tau)$, consisting
of a set X and a topology $\tau$ defined in X.

Just as a metric space is a pair consisting of a set X and a metric defined in
X, so a topological space is a pair consisting of a set X and a topology defined
in X. Thus, to specify a topological space, we must specify both a set X and
a topology in X, i.e., we must indicate which subsets of X are to be regarded
as “open (in X).” Clearly, we can equip one and the same set with various
different topologies, thereby defining various different topological spaces.
Nevertheless, we will usually denote a topological space, namely a pair $(X, \tau)$,
by a single letter like T. Just as in the case of a metric space R, the elements
of a topological space T will be called the points of T.

By the closed sets of a topological space T, we mean the complements
T - G of the open sets G of T. It follows from Definition 1 and the “duality
principle” (see p. 4) that


1′) The space T itself and the empty set $\emptyset$ are closed;


2′) Arbitrary (finite or infinite) intersections $\bigcap_\alpha F_\alpha$ and finite unions
$\bigcup_{k=1}^{n} F_k$ of closed sets of T are closed.


The natural way of introducing the concepts of neighborhood, contact 
point, limit point and closure of a set is now apparent: 


a) By a neighborhood of a point x in a topological space T is meant any
open set $G \subset T$ containing x;

b) A point $x \in T$ is called a contact point of a set $M \subset T$ if every
neighborhood of x contains at least one point of M;

c) A point $x \in T$ is called a limit point of a set $M \subset T$ if every
neighborhood of x contains infinitely many points of M;

d) The set of all contact points of a set $M \subset T$ is called the closure of
M, denoted by [M].


Example 1. According to Theorem 5, p. 50, the open sets in any metric 
space satisfy the two properties in Definition 1. Hence every metric space 
is a topological space as well. 


Example 2. Given any set T, suppose we regard every subset of T as open.
Then T is a topological space (the properties in Definition 1 are obviously
satisfied). In particular, every set $M \subset T$ is both open and closed, and every
set $M \subset T$ coincides with its own closure. Note that the “discrete metric
space” of Example 1, p. 38 has this trivial topology.


Example 3. As another extreme case, consider an arbitrary set T equipped
with a topology consisting of just two sets, the whole set T and the empty
set $\emptyset$. Then T is a topological space, a kind of “space of coalesced points”
(mainly of academic interest). Note that the closure of every nonempty set
is the whole space T.


Example 4. Let T be the set {a, b}, consisting of just two points a and b,
and let the open sets in T be T itself, the empty set and the single-element set
{b}. Then the two properties in Definition 1 are satisfied, and T is a
topological space. The closed sets in this space are T itself, the empty set and the
set {a}. Note that the closure of {b} is the whole space T.
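
For finite sets, the two properties in Definition 1 and the definition of closure can be checked mechanically. The sketch below is an illustration with hypothetical function names (`is_topology`, `closure`); it verifies Example 4 and computes the closures mentioned there. On a finite set it is enough to test pairwise unions and intersections, since arbitrary unions reduce to finite ones.

```python
from itertools import combinations


def is_topology(X, tau):
    """Check the two properties of Definition 1 for a system tau of subsets of a finite set X."""
    X = frozenset(X)
    tau = {frozenset(G) for G in tau}
    if X not in tau or frozenset() not in tau:
        return False
    # For a finite system, closure under pairwise unions and intersections suffices.
    return all(G | H in tau and G & H in tau for G, H in combinations(tau, 2))


def closure(X, tau, M):
    """Set of contact points of M: points all of whose neighborhoods meet M."""
    M = frozenset(M)
    return frozenset(x for x in X
                     if all(frozenset(G) & M for G in tau if x in G))


# Example 4: T = {a, b} with open sets T, the empty set and {b}.
T = {"a", "b"}
tau = [set(), {"b"}, {"a", "b"}]

# A system that is not a topology: the union {a, b} of two members is missing.
not_tau = [set(), {"a"}, {"b"}, {"a", "b", "c"}]
```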


9.2. Comparison of topologies. Let $\tau_1$ and $\tau_2$ be two topologies defined
in the same set X.$^1$ Then we say that the topology $\tau_1$ is stronger than the
topology $\tau_2$ (or equivalently that $\tau_2$ is weaker than $\tau_1$) if $\tau_2 \subset \tau_1$, i.e., if
every set of the system $\tau_2$ is a set of the system $\tau_1$.


THEOREM 1. The intersection $\tau = \bigcap_\alpha \tau_\alpha$ of any set of topologies in X
is itself a topology in X.


Proof. Clearly $\bigcap_\alpha \tau_\alpha$ contains X and $\emptyset$. Moreover, since every $\tau_\alpha$ is
closed (algebraically) under the operations of forming arbitrary unions and
finite intersections, the same is true of $\bigcap_\alpha \tau_\alpha$. ∎


COROLLARY. Let $\mathcal{B}$ be any system of subsets of a set X. Then there
exists a minimal topology in X containing $\mathcal{B}$, i.e., a topology $\tau(\mathcal{B})$
containing $\mathcal{B}$ and contained in every topology containing $\mathcal{B}$.


Proof. A topology containing $\mathcal{B}$ always exists, e.g., the topology
in which every subset of X is open. The intersection of all topologies
containing $\mathcal{B}$ is the desired minimal topology $\tau(\mathcal{B})$, often called the
topology generated by the system $\mathcal{B}$. ∎


Let $\mathcal{B}$ be a system of subsets of X and A a fixed subset of X. Then by
the trace of the system $\mathcal{B}$ on the set A we mean the system $\mathcal{B}_A$ consisting of
all subsets of X of the form $A \cap B$, $B \in \mathcal{B}$. It is easy to see that the trace
(on A) of a topology $\tau$ (defined in X) is a topology $\tau_A$ in A. (Such a
topology is often called a relative topology.) In this sense, every subset A of a
given topological space $(X, \tau)$ generates a new topological space $(A, \tau_A)$,
called a subspace of the original topological space $(X, \tau)$.
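
On a finite set the generated topology and the trace can be computed directly. The sketch below is an illustration with hypothetical names (`generated_topology`, `trace`). Instead of intersecting all topologies containing the given system, it closes the system (together with the empty set and X) under pairwise unions and intersections; for a finite set this yields the same minimal topology, since every topology containing the system must contain all sets produced this way.

```python
def generated_topology(X, B):
    """Minimal topology on a finite set X containing the system B of subsets of X."""
    tau = {frozenset(), frozenset(X)} | {frozenset(S) for S in B}
    while True:
        new = ({G | H for G in tau for H in tau}
               | {G & H for G in tau for H in tau})
        if new <= tau:          # nothing new: tau is closed under both operations
            return tau
        tau |= new


def trace(system, A):
    """Trace of a system of sets on A: all intersections of A with members of the system."""
    A = frozenset(A)
    return {frozenset(S) & A for S in system}


# Hypothetical example on X = {1, 2, 3}.
X = {1, 2, 3}
tau = generated_topology(X, [{1, 2}, {2, 3}])
tau_A = trace(tau, {1, 3})      # relative topology on the subset A = {1, 3}
```

In this example the generated topology consists of five sets, and the relative topology on {1, 3} turns out to be the one in which every subset is open.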


9.3. Bases. Axioms of countability. As we have seen, defining a topology
in a space T means specifying a system of open sets in T. However, in many
concrete problems, it is more convenient to specify, instead of all the open
sets, some system of subsets which uniquely determines all the open sets.
For example, in the case of a metric space we first introduced the notion of
an open sphere ($\varepsilon$-neighborhood) and then defined an open set G as a set such
that every point $x \in G$ has a neighborhood $O_\varepsilon(x) \subset G$. In other words, the
open sets in a metric space are precisely those which can be represented as 
finite or infinite unions of open spheres. In particular, the open sets on the 
real line are precisely those which can be represented as finite or countable 
unions of open intervals (recall Theorem 6, p. 51). These considerations 
suggest 


1 This gives two topological spaces $T_1 = (X, \tau_1)$ and $T_2 = (X, \tau_2)$.

DEFINITION 3. A family $\mathcal{G}$ of open subsets of a topological space T is
called a base for T if every open set in T can be represented as a union of
sets in $\mathcal{G}$.


Example 1. The set of all open spheres (of all possible radii and with all 
possible centers) in a metric space R is a base for R. In particular, the set 
of all open intervals is a base on the real line. The set of all open intervals 
with rational end points is also a base on the line, since any open interval 
(and hence any open set on the line) can be represented as a union of such 
intervals. 


It is clear from the foregoing that a topology $\tau$ can be defined in a set T
by specifying a base $\mathcal{G}$ in T. This topology $\tau$ is just the system of sets which
can be represented as unions of sets in $\mathcal{G}$. If this way of specifying a topology
is to be of practical value, we must find requirements which, when imposed
on a system $\mathcal{G}$ of subsets of a given set T, guarantee that the system $\tau$ of all
possible unions of sets in $\mathcal{G}$ be a topology in T, i.e., that $\tau$ have the two
properties figuring in Definition 1:


THEOREM 2. Given a set T, let $\mathcal{G}$ be a system of subsets $G_\alpha \subset T$ with the
following two properties:


1) Every point $x \in T$ belongs to at least one $G_\alpha \in \mathcal{G}$;
2) If $x \in G_\alpha \cap G_\beta$, then there is a $G_\gamma \in \mathcal{G}$ such that $x \in G_\gamma \subset G_\alpha \cap G_\beta$.


Suppose the empty set $\emptyset$ and all sets representable as unions of sets $G_\alpha$
are designated as open. Then T is a topological space, and $\mathcal{G}$ is a base for T.


Proof. It follows at once from the conditions of the theorem that the
whole set T and the empty set $\emptyset$ are open sets, and that the union of any
number of open sets is open. We must still show that the intersection of
a finite number of open sets is open. It is enough to prove this for just
two sets. Thus let

$$A = \bigcup_\alpha G_\alpha, \qquad B = \bigcup_\beta G_\beta.$$

Then

$$A \cap B = \bigcup_{\alpha, \beta} (G_\alpha \cap G_\beta). \qquad (1)$$

By hypothesis, given any point $x \in G_\alpha \cap G_\beta$, there is a $G_\gamma \in \mathcal{G}$ such that
$x \in G_\gamma \subset G_\alpha \cap G_\beta$. Hence the set $G_\alpha \cap G_\beta$ is open, being the union of
all $G_\gamma$ contained in $G_\alpha \cap G_\beta$. But then (1) is also open. Therefore T is a
topological space. The fact that $\mathcal{G}$ is a base for T is clear from the way
open sets in T are defined. ∎
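
For a finite set, the two conditions of Theorem 2 can be tested directly, and the open sets can be listed as all unions of members of the system. The sketch below uses hypothetical names (`satisfies_base_conditions`, `open_sets_from_base`) and small sample systems chosen for illustration. The system consisting of {1, 2} and {2, 3} alone violates condition 2 at the point 2; adding {2} repairs it.

```python
from itertools import chain, combinations


def satisfies_base_conditions(T, G):
    """Check conditions 1) and 2) of Theorem 2 for a system G of subsets of a finite set T."""
    G = [frozenset(S) for S in G]
    covers = all(any(x in S for S in G) for x in T)
    refines = all(
        any(x in C and C <= (A & B) for C in G)
        for A in G for B in G for x in A & B
    )
    return covers and refines


def open_sets_from_base(G):
    """All unions of subfamilies of G (the empty subfamily gives the empty set)."""
    G = [frozenset(S) for S in G]
    subfamilies = chain.from_iterable(combinations(G, r) for r in range(len(G) + 1))
    return {frozenset().union(*family) for family in subfamilies}


T = {1, 2, 3}
bad = [{1, 2}, {2, 3}]            # condition 2 fails at the point 2
good = [{1, 2}, {2, 3}, {2}]      # adding {2} restores condition 2
open_sets = open_sets_from_base(good)
```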


The following theorem is a useful tool for deciding whether or not a 
given system of open sets is a base: 

THEOREM 3. A system $\mathcal{G}$ of open sets $G_\alpha$ in a topological space T is a
base for T if and only if, given any open set $G \subset T$ and any point $x \in G$,
there is a set $G_\alpha \in \mathcal{G}$ such that $x \in G_\alpha \subset G$.


Proof. If $\mathcal{G}$ is a base for T, then every open set $G \subset T$ is a union

$$G = \bigcup_\alpha G_\alpha$$

of sets $G_\alpha \in \mathcal{G}$. Therefore every point $x \in G$ is contained in some set
$G_\alpha \subset G$. Conversely, given any open set $G \subset T$, suppose that for every
point $x \in G$ there is a set $G_\alpha(x) \in \mathcal{G}$ such that $x \in G_\alpha(x) \subset G$. Then

$$G = \bigcup_{x \in G} G_\alpha(x),$$

i.e., G is a union of sets in $\mathcal{G}$. ∎


Example 2. It follows from Theorem 3 that the set of all open spheres 
with rational radii (and all possible centers) in a metric space R is a base for 
R (this is obvious anyway). In particular, as already noted in Example 1, 
the set of all open intervals with rational end points is a base for the real line. 


An important class of topological spaces consists of spaces with a countable 
base, i.e., spaces in which there is at least one base containing no more than 
countably many sets. Such a space is also said to satisfy the second axiom of 
countability. 


THEOREM 4. If a topological space T has a countable base, then T contains
a countable everywhere dense subset, i.e., a countable set $M \subset T$
such that [M] = T.


Proof. Let $\mathcal{G} = \{G_1, G_2, \ldots, G_n, \ldots\}$ be a countable base for T,
and choose a point $x_n$ in each $G_n$. Then the set

$$M = \{x_1, x_2, \ldots, x_n, \ldots\}$$

is countable. Moreover, M is everywhere dense in T, since otherwise
the nonempty open set G = T - [M] would contain no points of M.
But this is impossible, since G is a union of some of the sets $G_n$ in $\mathcal{G}$ and
$G_n$ contains the point $x_n \in M$. ∎


For metric spaces, we can say even more: 


THEOREM 5. If a metric space R has a countable everywhere dense 
subset, then R has a countable base. 


Proof. Suppose R has a countable everywhere dense subset
$\{x_1, x_2, \ldots, x_n, \ldots\}$. Then, given any open set $G \subset R$ and any $x \in G$, there
is an open sphere $S(x_m, 1/n)$ such that $x \in S(x_m, 1/n) \subset G$ for suitable
positive integers m and n (why?). Hence the open spheres $S(x_m, 1/n)$,
where m and n range over all positive integers, form a countable base for
R. ∎


Combining Theorems 4 and 5, we see that a metric space R has a countable 
base if and only if it has a countable everywhere dense subset. 


Example 3. Every separable metric space, i.e., every metric space with a 
countable everywhere dense subset, is a metric space satisfying the second 
axiom of countability. 


Example 4. The space m of all bounded sequences is not separable (recall 
Example 7, p. 48) and hence has no countable base. 


Remark. In general, Theorem 5 does not hold for arbitrary (nonmetric) 
topological spaces. In fact, examples can be given of topological spaces 
which have a countable everywhere dense subset but no countable base. Let us 
see how this might come about. Given any point x of a metric space R, there 
is a countable neighborhood base (or local base) at x, i.e., a countable system 
$\mathcal{O}$ of neighborhoods of x with the following property: Given any open set G
containing x, there is a neighborhood $O \in \mathcal{O}$ such that $O \subset G$ (cf. Theorem
3).$^2$ Suppose every point x of a topological space T has a countable
neighborhood base. Then T is said to satisfy the first axiom of countability.
However, this axiom need not be satisfied in an arbitrary topological space. 
Hence the argument used in the case of metric spaces to deduce the existence 
of a countable base from that of a countable everywhere dense subset does 
not carry over to the case of an arbitrary topological space. 


A system $\mathcal{M}$ of sets $M_\alpha$ is called a cover (or covering) of a topological
space T, and $\mathcal{M}$ is said to cover T, if

$$T = \bigcup_\alpha M_\alpha.$$

A cover consisting of open (or closed) sets only is called an open (or closed)
cover. If $\mathcal{M}$ is a cover of a topological space T, then by a subcover of $\mathcal{M}$
we mean any subset of $\mathcal{M}$ which also covers T.


THEOREM 6. If T is a topological space with a countable base $\mathcal{G}$, then
every open cover $\mathcal{O}$ has a finite or countable subcover.


Proof. Since $\mathcal{O}$ covers T, each point $x \in T$ belongs to some open set
$O_\alpha \in \mathcal{O}$. Moreover, since $\mathcal{G}$ is a countable base for T, for each $x \in T$
there is a set $G_n(x) \in \mathcal{G}$ such that $x \in G_n(x) \subset O_\alpha$ (recall Theorem 3).


2 For example, the set of open spheres S(x, 1/n) is a countable neighborhood base at
any point x of a metric space R. 

The collection of all sets $G_n(x)$ selected in this way is finite or countable
and covers T. For each $G_n(x)$ we now choose one of the sets $O_\alpha$ containing
$G_n(x)$, thereby obtaining a finite or countable subcover of $\mathcal{O}$. ∎


Given any topological space T, the empty set @ and the space T itself 
are both open and closed, by definition. A topological space T is said to 
be connected if it has no subsets other than @ and T which are both open 
and closed. For example, the real line $R^1$ is connected, but not the set
$R^1 - \{x\}$ obtained from $R^1$ by deleting any point x.


9.4. Convergent sequences in a topological space. The concept of a con- 
vergent sequence, introduced in Sec. 6.2 for the case of a metric space, 
generalizes in the natural way to the case of a topological space. Thus a 
sequence of points $\{x_n\} = x_1, x_2, \ldots, x_n, \ldots$ in a topological space T is
said to converge to a point $x \in T$ (called the limit of the sequence) if every
neighborhood G(x) of x contains all points $x_n$ starting from a certain index.$^3$
However, the concept of a convergent sequence does not play the same basic
role for topological spaces as for metric spaces. In fact, in the case of a
metric space R, a point x is a contact point of a set $M \subset R$ if and only if M
contains a sequence converging to x. On the other hand, in the case of a
topological space T, this is in general not true, as shown by Problem 11.
In other words, a point x can be a contact point of a set $M \subset T$ (i.e., x can
belong to [M]) without M containing a sequence converging to x. However,
convergent sequences “are given their rights back” if T satisfies the first
axiom of countability, i.e., if there is a countable neighborhood base at every
point $x \in T$:


THEOREM 7. If a topological space T satisfies the first axiom of
countability, then every contact point x of a set $M \subset T$ is the limit of a
convergent sequence of points in M.


Proof. Let $\mathcal{O}$ be a countable neighborhood base at x, consisting of
sets $O_n$. It can be assumed that $O_{n+1} \subset O_n$ (n = 1, 2, ...), since otherwise
we need only replace $O_n$ by $\bigcap_{k=1}^{n} O_k$. Let $x_n$ be any point of M
contained in $O_n$. Such a point $x_n$ can always be found, since x is a
contact point of M. Then the sequence $\{x_n\}$ obviously converges to
x. ∎


Remark. As already noted, every metric space satisfies the first axiom 
of countability. This, together with Theorem 7, shows why in the case of 
metric spaces we were able to formulate concepts like contact point, limit 


3 More exactly, if, given any G(x), there is an integer $N_G$ such that G(x) contains all
points $x_n$ with $n \ge N_G$.
point, etc. in terms of convergent sequences (recall Theorems 2 and 2′, p. 48).


9.5. Axioms of separation. Although many basic concepts of the theory 
of metric spaces carry over easily to the case of topological spaces, an 
arbitrary topological space is still too general an object for most problems 
of analysis. In fact, things can happen in an arbitrary topological space 
which differ in an essential way from what happens in a metric space. Thus, 
for example, a finite set of points need not be closed in an arbitrary topo- 
logical space, as shown in Example 4, p. 79. Hence it is desirable to 
specialize the notion of a topological space somewhat by considering topo- 
logical spaces more closely resembling metric spaces. This is done by 
imposing extra conditions on a topological space T, in addition to the two
defining properties figuring in Definition 1, p. 78. For example, as we 
have already seen, the axioms of countability allow us to study topological 
spaces from the standpoint of the concept of convergence. We now introduce 
supplementary conditions, called axioms of separation, of quite a different 
type: 

DEFINITION 4. Suppose that for each pair of distinct points x and y in
a topological space T, there is a neighborhood $O_x$ of x and a neighborhood
$O_y$ of y such that $x \notin O_y$, $y \notin O_x$. Then T is said to satisfy the first axiom of
separation, and is called a $T_1$-space.


Example 1. The space in Example 2, p. 79 is a $T_1$-space, but not the space
in Example 4.


THEOREM 8. Every finite subset of a $T_1$-space is closed.


Proof. Given any single-element set {x}, suppose $y \neq x$. Then y
has a neighborhood $O_y$ which does not contain x, i.e., $y \notin [\{x\}]$. There-
fore [{x}] = {x}, i.e., every “singleton” {x} is closed. But every finite
union of closed sets is itself closed. Hence every finite subset of the given
space is closed. ∎
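
For a finite space, the first axiom of separation and the conclusion of Theorem 8 can be checked mechanically. The following is a minimal sketch, assuming a topology is stored as a Python set of frozensets; the helper names `is_t1` and `is_closed` and the two sample two-point topologies are illustrative choices, not taken from the text.

```python
from itertools import combinations

def is_t1(points, opens):
    """First axiom of separation: for distinct x and y, some open set
    contains x but not y, and some open set contains y but not x."""
    def separates(x, y):
        return any(x in o and y not in o for o in opens)
    return all(separates(x, y) and separates(y, x)
               for x, y in combinations(points, 2))

def is_closed(subset, points, opens):
    """A set is closed when its complement is one of the open sets."""
    return frozenset(points) - frozenset(subset) in opens

points = frozenset({"a", "b"})
# Discrete topology: every subset is open.
discrete = {frozenset(), frozenset({"a"}), frozenset({"b"}), points}
# Hypothetical two-point space in which {b} is the only proper nonempty open set.
two_point = {frozenset(), frozenset({"b"}), points}

t1_discrete = is_t1(points, discrete)
t1_two_point = is_t1(points, two_point)
singletons_closed_discrete = all(is_closed({p}, points, discrete) for p in points)
singletons_closed_two_point = all(is_closed({p}, points, two_point) for p in points)
```

In this sketch the discrete space is a $T_1$-space with closed singletons, while in the second space the singleton {b} fails to be closed, in agreement with Theorem 8.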


The next axiom of separation is stronger than the first axiom: 


DEFINITION 5. Suppose that for each pair of distinct points x and y in
a topological space T, there is a neighborhood $O_x$ of x and a neighborhood
$O_y$ of y such that $O_x \cap O_y = \emptyset$. Then T is said to satisfy the second (or
Hausdorff) axiom of separation, and is called a $T_2$-space or Hausdorff
space.


Thus, roughly speaking, each pair of distinct points in a Hausdorff space
has a pair of disjoint neighborhoods.

Example 2. Every Hausdorff space is a $T_1$-space, but not conversely (see
Problem 10).


Topological spaces more general than Hausdorff spaces are rarely used 
in analysis. In fact, most of the topological spaces of interest in analysis 
satisfy a separation condition even stronger than the second axiom of 
separation: 


DEFINITION 6. A $T_1$-space T is said to be normal if for each pair of
disjoint closed sets $F_1$ and $F_2$ in T, there is an open set $O_1$ containing $F_1$
and an open set $O_2$ containing $F_2$ such that $O_1 \cap O_2 = \emptyset$.


In other words, each pair of disjoint closed sets in a normal space has a 
pair of disjoint “neighborhoods.” 


Example 3. Obviously, every normal space is a Hausdorff space. 


Example 4. Consider the closed unit interval [0, 1], where neighborhoods
of any point $x \neq 0$ are defined in the usual way (i.e., as open sets containing
x), but neighborhoods of the point x = 0 are all half-open intervals $[0, \alpha)$
with the points

$$\frac{1}{n} \quad (n = 1, 2, \ldots) \qquad (2)$$

deleted (and arbitrary unions and finite intersections of these neighborhoods
with neighborhoods of nonzero points). This space is Hausdorff, but not
normal since the set {0} and the set of points (2) are disjoint closed sets
without disjoint neighborhoods.


THEOREM 9. Every metric space is normal. 


Proof. Let X and Y be any two disjoint closed subsets of a metric space R. Every
point $x \in X$ has a neighborhood $O_x$ disjoint from Y, and hence is at a
positive distance $\rho_x$ from Y (recall Problem 9, p. 54). Similarly, every
point $y \in Y$ is at a positive distance $\rho_y$ from X. Consider the open sets

$$U = \bigcup_{x \in X} S(x, \tfrac{1}{2}\rho_x), \qquad V = \bigcup_{y \in Y} S(y, \tfrac{1}{2}\rho_y),$$

where, as usual, S(x, r) is the open sphere with center x and radius r.
It is clear that $X \subset U$, $Y \subset V$. Moreover, U and V are disjoint. In fact,
suppose to the contrary that there is a point $z \in U \cap V$. Then there are
points $x_0 \in X$, $y_0 \in Y$ such that

$$\rho(x_0, z) < \tfrac{1}{2}\rho_{x_0}, \qquad \rho(z, y_0) < \tfrac{1}{2}\rho_{y_0}.$$

Assume, to be explicit, that $\rho_{x_0} \leq \rho_{y_0}$. Then

$$\rho(x_0, y_0) \leq \rho(x_0, z) + \rho(z, y_0) < \tfrac{1}{2}\rho_{x_0} + \tfrac{1}{2}\rho_{y_0} \leq \rho_{y_0},$$

i.e., $x_0 \in S(y_0, \rho_{y_0})$. This contradicts the definition of $\rho_{y_0}$, and shows that
there is no point $z \in U \cap V$. ∎


Remark. Every subspace of a metric space is itself a metric space and 
hence normal. This is not true for normal spaces in general, i.e., a subspace 
of a normal space need not be normal.⁴ A property of a topological space
T shared by every subspace of T is said to be hereditary. Thus normality of a 
space is not a hereditary property. These ideas are pursued in Problems 
13 and 14. 


9.6. Continuous mappings. Homeomorphisms. The concept of a contin- 
uous mapping, introduced for metric spaces in Sec. 5.2, generalizes at once 
to the case of arbitrary topological spaces. Thus, let f be a mapping of one 
topological space X into another topological space Y, so that f associates 
an element $y = f(x) \in Y$ with each element $x \in X$. Then f is said to be
continuous at the point $x_0 \in X$ if, given any neighborhood $V_{y_0}$ of the point
$y_0 = f(x_0)$, there is a neighborhood $U_{x_0}$ of the point $x_0$ such that
$f(U_{x_0}) \subset V_{y_0}$. The mapping f is said to be continuous on X if it is continuous at every
point of X. In particular, a continuous mapping of a topological space X 
into the real line is called a continuous real function on X. 


Remark. These definitions clearly reduce to the corresponding definitions 
for metric spaces in Sec. 5.2 if X and Y are both metric spaces. 


The notion of continuity of a mapping f of one topological space into
another⁵ is easily stated in terms of open sets, i.e., in terms of the topologies
of the two spaces: 


THEOREM 10. A mapping f of a topological space X into a topological
space Y is continuous if and only if the preimage $\Gamma = f^{-1}(G)$ of every
open set $G \subset Y$ is open (in X).


Proof. Suppose f is continuous on X, and let G be any open subset
of Y. Choose any point $x \in \Gamma = f^{-1}(G)$, and let y = f(x). Then G is a
neighborhood of the point y. Hence, by the continuity of f, there is a
neighborhood $U_x$ of x such that $f(U_x) \subset G$, i.e., $U_x \subset \Gamma$. In other words,
every point $x \in \Gamma$ has a neighborhood contained in $\Gamma$. But then $\Gamma$ is
open (see Problem 1).

Conversely, suppose $\Gamma = f^{-1}(G)$ is open whenever $G \subset Y$ is open.
Given any point $x \in X$, let $V_y$ be any neighborhood of the point y = f(x).
Then clearly $x \in f^{-1}(V_y)$, and moreover $f^{-1}(V_y)$ is open, by hypothesis.
Therefore $U_x = f^{-1}(V_y)$ is a neighborhood of x such that $f(U_x) \subset V_y$.
In other words, f is continuous at x and hence on X, since x is an arbitrary
point of X. ∎

⁴ See, e.g., J. L. Kelley, General Topology, D. Van Nostrand Co., Inc., Princeton, N.J.
(1955), p. 132.

⁵ If desired, the mapping f can always be regarded as “onto,” since otherwise we need
only replace the space Y by the subspace $f(X) \subset Y$.


Naturally, Theorem 10 has the following “dual’’: 


THEOREM 10'. A mapping f of a topological space X into a topological
space Y is continuous if and only if the preimage $\Gamma = f^{-1}(F)$ of every closed
set $F \subset Y$ is closed (in X).


Proof. Use the fact that the preimage of a complement is the comple-
ment of the preimage. ∎
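
For finite spaces, the criteria of Theorems 10 and 10' reduce to a finite check. The sketch below assumes a mapping is stored as a dictionary and a topology as a set of frozensets; the names `preimage`, `is_continuous_open`, `is_continuous_closed` and the sample spaces are hypothetical illustrations.

```python
def preimage(f, subset, domain):
    """Set of all points of the domain that f carries into the subset."""
    return frozenset(x for x in domain if f[x] in subset)

def is_continuous_open(f, X, opens_X, opens_Y):
    """Theorem 10: the preimage of every open set is open."""
    return all(preimage(f, G, X) in opens_X for G in opens_Y)

def is_continuous_closed(f, X, opens_X, Y, opens_Y):
    """Theorem 10': the preimage of every closed set is closed."""
    closed_X = {X - G for G in opens_X}
    closed_Y = {Y - G for G in opens_Y}
    return all(preimage(f, F, X) in closed_X for F in closed_Y)

# Hypothetical finite spaces used only for illustration.
X = frozenset({1, 2, 3})
opens_X = {frozenset(), frozenset({1}), frozenset({1, 2}), X}
Y = frozenset({"u", "v"})
opens_Y = {frozenset(), frozenset({"u"}), Y}

f = {1: "u", 2: "u", 3: "v"}   # preimage of {u} is {1, 2}, which is open
g = {1: "v", 2: "u", 3: "u"}   # preimage of {u} is {2, 3}, which is not open

f_open = is_continuous_open(f, X, opens_X, opens_Y)
f_closed = is_continuous_closed(f, X, opens_X, Y, opens_Y)
g_open = is_continuous_open(g, X, opens_X, opens_Y)
g_closed = is_continuous_closed(g, X, opens_X, Y, opens_Y)
```

The two tests agree on both mappings, as the duality between Theorems 10 and 10' requires.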


Remark. Let X and Y be two arbitrary sets, and let f be a mapping of
X into Y. Suppose that in Y there is specified a topology $\tau$, i.e., a system
of sets containing Y and $\emptyset$, and closed under the operations of taking
arbitrary unions and finite intersections. Then since the preimage of a
union (or intersection) of sets equals the union (or intersection) of the
preimages of the sets, by Theorems 1 and 2, p. 5, the preimage of the
topology $\tau$, i.e., the system of all sets $f^{-1}(G)$ where $G \in \tau$, is a topology
in X which we denote by $f^{-1}(\tau)$.

Suppose now that X and Y are topological spaces, with topologies $\tau_X$
and $\tau_Y$, respectively. Then Theorem 10, giving a necessary and sufficient
condition for a mapping f of X into Y to be continuous, can be paraphrased
as follows: A mapping f of X into Y is continuous if and only if the topology
$\tau_X$ is stronger than the topology $f^{-1}(\tau_Y)$.
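
The preimage topology and the paraphrase of Theorem 10 can be illustrated on a small example. This is only a sketch: it assumes that “stronger” means “contains every set of the other topology,” and the helper names and sample data are hypothetical.

```python
from itertools import combinations

def preimage(f, subset, domain):
    """Set of all points of the domain that f carries into the subset."""
    return frozenset(x for x in domain if f[x] in subset)

def preimage_topology(f, X, tau_Y):
    """System of all sets f^{-1}(G) with G in tau_Y."""
    return {preimage(f, G, X) for G in tau_Y}

def is_topology(X, tau):
    """For a finite system, closure under pairwise unions and
    intersections is enough."""
    if frozenset() not in tau or X not in tau:
        return False
    return all(a | b in tau and a & b in tau for a, b in combinations(tau, 2))

def is_stronger(tau_1, tau_2):
    """Assumed reading: tau_1 is stronger than tau_2 if it contains tau_2."""
    return tau_2 <= tau_1

X = frozenset({1, 2, 3})
Y = frozenset({"u", "v"})
tau_Y = {frozenset(), frozenset({"u"}), Y}
f = {1: "u", 2: "u", 3: "v"}

pulled_back = preimage_topology(f, X, tau_Y)
tau_X = {frozenset(), frozenset({1}), frozenset({1, 2}), X}
trivial = {frozenset(), X}

pulled_back_is_topology = is_topology(X, pulled_back)
continuous_for_tau_X = is_stronger(tau_X, pulled_back)
continuous_for_trivial = is_stronger(trivial, pulled_back)
```

Here f is continuous when X carries `tau_X` but not when X carries the trivial topology, since the latter does not contain the set {1, 2}.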


Example. It is easy to see that the image (as opposed to the preimage) of 
an open set under a continuous mapping need not be open. Similarly, the 
image of a closed set under a continuous mapping need not be closed. For 
example, consider the mapping of the half-open interval X = [0, 1) onto the
circle of unit circumference corresponding to “winding” the interval onto
the circle. Then the set $[\frac{1}{2}, 1)$, which is closed in [0, 1), goes into a set which
is not closed on the circle (see Figure 12).

The theorem on continuity of composite functions, familiar from
elementary calculus, has the following analogue for topological spaces: 


THEOREM 11. Given topological spaces X, Y and Z, suppose f is a
continuous mapping of X into Y and $\varphi$ a continuous mapping of Y into Z.
Then the mapping $\varphi f$, i.e., the mapping carrying x into $\varphi(f(x))$, is
continuous.


Proof. An immediate consequence of Theorem 10. ∎
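
The step left to the reader can be spelled out in one line. As a sketch, let G denote an arbitrary open subset of Z (the letter G is chosen here only for illustration):

```latex
(\varphi f)^{-1}(G) = f^{-1}\bigl(\varphi^{-1}(G)\bigr)
```

Since $\varphi$ is continuous, $\varphi^{-1}(G)$ is open in Y by Theorem 10, and since f is continuous, $f^{-1}(\varphi^{-1}(G))$ is open in X. Hence the preimage under $\varphi f$ of every open set is open, and Theorem 10 applies once more.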


Given two topological spaces X and Y, let f be a one-to-one mapping of X
onto Y, and suppose f and $f^{-1}$ are both continuous. Then f is called a
homeomorphic mapping or simply a homeomorphism (between X and Y).
Two spaces X and Y are said to be homeomorphic if there exists a homeo-
morphism between them. Homeomorphic spaces have the same topological
properties, and from the topological point of view are merely two “repre-
sentatives” of one and the same space. In fact, if X and Y have topologies
$\tau_X$ and $\tau_Y$, respectively, and if f is a homeomorphic mapping of X onto Y,
then $\tau_X = f^{-1}(\tau_Y)$ and $\tau_Y = f(\tau_X)$. The relation of being homeomorphic
is obviously reflexive, symmetric and transitive, and hence is an equivalence 
relation. Therefore any given family of topological spaces can be partitioned 
into disjoint classes of homeomorphic spaces. 


Remark. Again these are the natural generalizations of the same notions 
for metric spaces, introduced in Sec. 2.2. It should be noted that two homeo- 
morphic metric spaces need not have the same “metric properties” (recall 
Problem 9, p. 66). Note also that the topology of a metric space is uniquely 
determined by its metric, but not conversely (illustrate this by an example). 


9.7. Various ways of specifying topologies. Metrizability. The most direct 
and in principle the simplest way of specifying a topology in a space T is to 
indicate which subsets of T are regarded as open. The system of all such 
subsets must then satisfy properties 1) and 2) of Definition 1. By duality, 
we could just as well indicate which subsets of T are regarded as closed.
The system of all such subsets must then satisfy properties 1’) and 2’) on 
p. 79. However, this method is of limited practical value. For example, in 
the case of the plane it is hardly possible to give a direct description of all 
open sets (as was done in Theorem 6, p. 51 for the case of the line). 

A topology is often specified in a space T by giving a base for T. In
fact, this is precisely what is done in Sec. 6 for the case of a metric space R, 
where the base for R consists of all open spheres (or even all open spheres 
with rational radii). 

Another way of specifying a topology in a space T is to introduce the
notion of convergence in T. As noted in Sec. 9.4, this is not a universal
method. It does work, however, in the case of spaces satisfying the first
axiom of countability.⁶

Still another way of introducing a topology in a space T is to specify
a closure operator in T, i.e., a mapping which assigns to each subset $M \subset T$
a subset $[M] \subset T$ and satisfies the four properties listed in Theorem 1,
p. 46. It can be shown that the system of complements of all sets $M \subset T$
such that [M] = M is then a topology in T.⁷
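
On a finite set, the passage from a closure operator to a topology can be carried out by brute force. The sketch below is illustrative only: the operator `closure` is a hypothetical example, chosen so that it satisfies the usual closure properties (it contains M, is idempotent, distributes over finite unions, and sends the empty set to itself), and the helper names are not from the text.

```python
from itertools import chain, combinations

def powerset(points):
    """All subsets of a finite set, as frozensets."""
    items = sorted(points)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1))]

def topology_from_closure(points, closure):
    """Open sets are the complements of the sets M with closure(M) == M."""
    points = frozenset(points)
    return {points - M for M in powerset(points) if closure(M) == M}

def closure(M):
    """Hypothetical closure operator on {1, 2, 3}: every nonempty set
    acquires the point 3; the empty set is left unchanged."""
    return frozenset(M) | {3} if M else frozenset()

T = frozenset({1, 2, 3})
tau = topology_from_closure(T, closure)
```

For this operator the closed sets are the empty set and the sets containing 3, so the resulting open sets are the whole space together with the subsets of {1, 2}.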

Specifying a metric in a space T is one of the most important ways of 
introducing a topology in T, but it is again far from being a universal method. 
As already noted, every metric space is normal and satisfies the first axiom 
of countability. Hence no metric can be used to introduce a topology in a 
space which fails to have these two properties. A topological space T is said 
to be metrizable if its topology can be specified by means of some metric 
(more exactly, if it is homeomorphic to some metric space). As just pointed 
out, a necessary condition for a topological space T to be metrizable is that 
it be normal and satisfy the first axiom of countability. However, it can be 
shown that these conditions are not sufficient for T to be metrizable. On the 
other hand, in the case of a space with a countable base (i.e., satisfying the 
second axiom of countability), we have 


URYSOHN’S METRIZATION THEOREM. A necessary and sufficient condi- 
tion for a topological space with a countable base to be metrizable is that 
it be normal. 


The necessity follows from Theorem 9. For the sufficiency we refer to the 
literature.⁸


Problem 1. Given a topological space T, prove that a set $G \subset T$ is open if
and only if every point $x \in G$ has a neighborhood contained in G.


Problem 2. Given a topological space T, prove that


a) [M] = M if and only if M is a closed set, i.e., the complement T − G
of an open set $G \subset T$;

b) [M] is the smallest closed set containing M;

c) The closure operator, i.e., the mapping carrying each subset M of T into
[M], satisfies Theorem 1, p. 46.


Problem 3. Consider the set $\mathcal{T}$ of all possible topologies defined in a
set X, where $\tau_1 \leq \tau_2$ means that $\tau_1$ is weaker than $\tau_2$. Verify that $\leq$ is a
partial ordering of $\mathcal{T}$. Does $\mathcal{T}$ have maximal and minimal elements? If so,
what are they?


⁶ In fact, by suitably generalizing the notion of convergence (and introducing the
concepts of “nets” and “filters”), this method can be made to work quite generally. See,
e.g., J. L. Kelley, op. cit., p. 83.

⁷ J. L. Kelley, op. cit., p. 43.

⁸ See, e.g., P. S. Alexandroff, Einführung in die Mengenlehre und die Theorie der reellen
Funktionen, VEB Deutscher Verlag der Wissenschaften, Berlin (1956), p. 195 ff.


Problem 4. Can two distinct topologies $\tau_1$ and $\tau_2$ in X generate the same
relative topology in a subset $A \subset X$?


Problem 5. Let

X = {a, b, c}, A = {a, b}, B = {b, c},

and let $\mathcal{G} = \{\emptyset, X, A, B\}$. Is $\mathcal{G}$ a base for a topology in X?


Problem 6. Prove that if M is an uncountable subset of a topological 
space with a countable base, then some point of M is a limit point of M. 


Problem 7. Prove that the topological space T in Example 4, p. 79 is 
connected. 


Comment. T might be called a “‘connected doubleton.” 


Problem 8. Prove that a topological space satisfying the second axiom of 
countability automatically satisfies the first axiom of countability. 


Problem 9. Give an example of a topological space satisfying the first 
axiom of countability but not the second axiom of countability. 


Problem 10. Let $\tau$ be the system of sets consisting of the empty set and
every subset of the closed unit interval X = [0, 1] obtained by deleting a finite
or countable number of points from X. Verify that $T = (X, \tau)$ is a topological
space. Prove that T satisfies neither the second nor the first axiom of count-
ability. Prove that T is a $T_1$-space, but not a Hausdorff space.


Problem 11. Let T be the topological space of the preceding problem. 
Prove that the only convergent sequences in T are the “stationary sequences,”
i.e., the sequences all of whose terms are the same starting from some index
n. Prove that the set M = (0, 1] has the point 0 as a contact point, but 
contains no sequence of points converging to 0. 


Problem 12. Prove the converse of Theorem 8. 


Comment. Hence a topological space T is a $T_1$-space if and only if every
finite subset of T is closed. 


Problem 13. Prove the following theorem, known as Urysohn's lemma:
Given a normal space T and two disjoint closed subsets $F_1, F_2 \subset T$, there
exists a continuous real function f such that $0 \leq f(x) \leq 1$ and

$$f(x) = \begin{cases} 0 & \text{if } x \in F_1, \\ 1 & \text{if } x \in F_2. \end{cases}$$


Problem 14. A $T_1$-space T is said to be completely regular if, given any
closed set $F \subset T$ and any point $x_0 \in T - F$, there exists a continuous real
function f such that $0 \leq f(x) \leq 1$ and

$$f(x) = \begin{cases} 0 & \text{if } x = x_0, \\ 1 & \text{if } x \in F. \end{cases}$$


(Completely regular spaces are also called Tychonoff spaces.) Prove that 
every normal space is completely regular, but not conversely. Prove that 
every subspace of a completely regular space (in particular, of a normal space) 
is completely regular. 


Comment. Thus, unlike normality, complete regularity is a hereditary 
property. It can be shown that a space is completely regular if and only if 
it is a subspace of a normal space.⁹ Completely regular spaces are particularly
important in analysis, since they “are able to support sufficiently many
continuous functions,” i.e., for any two distinct points x and y of a completely
regular space T, there is a continuous real function on T taking distinct
values at x and y. 


10. Compactness 


10.1. Compact topological spaces. The reader has presumably already 
encountered the familiar 


HEINE-BOREL THEOREM. Any cover of a closed interval [a, b] by a system 
of open intervals (or, more generally, open sets) has a finite subcover.


Generalizing this property of closed intervals, we are led to a key concept 
of real analysis: 


DEFINITION 1. A topological space T is said to be compact if every open 
cover of T has a finite subcover. A compact Hausdorff space is called a 
compactum. 


Example. As we will see in Sec. 11.2, any closed bounded subset of
Euclidean n-space $R^n$ is compact, for arbitrary n. On the other hand, $R^n$
itself (e.g., the real line or three-dimensional space) is not compact.


DEFINITION 2. A system of subsets $\{A_\alpha\}$ of a set T is said to be centered
if every finite intersection $\bigcap_{k=1}^{n} A_k$ is nonempty.¹⁰


⁹ J. L. Kelley, op. cit., p. 145.

¹⁰ A system of sets with typical member $A_\alpha$ will often be denoted by $\{A_\alpha\}$ (this is still
another use of curly brackets).


THEOREM 1. A topological space T is compact if and only if it has the
following property:

A) Every centered system of closed subsets of T has a nonempty
intersection.


Proof. Suppose T is compact, and let $\{F_\alpha\}$ be any centered system of
closed subsets of T. Then the sets $G_\alpha = T - F_\alpha$ are open. Hence the fact
that no finite intersection $\bigcap_{k=1}^{n} F_k$ is empty implies that no finite system of
sets $G_k = T - F_k$ covers T. But then the whole system of sets $\{G_\alpha\}$ cannot
cover T, by the compactness, and hence $\bigcap_\alpha F_\alpha \neq \emptyset$. In other words,
T has property A) if T is compact.

Conversely, suppose T has property A), and let $\{G_\alpha\}$ be any open
cover of T. Setting $F_\alpha = T - G_\alpha$, we find that $\bigcap_\alpha F_\alpha = \emptyset$, which, by
property A), implies that the system $\{F_\alpha\}$ is not centered, i.e., that there
are sets $F_1, \ldots, F_n$ such that $\bigcap_{k=1}^{n} F_k = \emptyset$. But then the corresponding
open sets $G_k = T - F_k$ form a finite subcover of the cover $\{G_\alpha\}$. In
other words, T is compact if T has property A). ∎


THEOREM 2. Every closed subset F of a compact topological space T is 
itself compact. 


Proof. Let $\{F_\alpha\}$ be any centered system of closed subsets of the sub-
space $F \subset T$. Then every $F_\alpha$ is closed in T as well, i.e., $\{F_\alpha\}$ is a centered
system of closed subsets of T. Therefore $\bigcap_\alpha F_\alpha \neq \emptyset$, by Theorem 1.
But then F is compact, by Theorem 1 again. ∎


COROLLARY. Every closed subset of a compactum is itself a compactum.


Proof. Use Theorem 2 and the fact that every subset of a Hausdorff
space is itself a Hausdorff space. ∎


THEOREM 3. Let K be a compactum and T any Hausdorff space con- 
taining K. Then K is closed in T. 


Proof. Suppose $y \notin K$, so that $y \in T - K$. Then, given any point
$x \in K$, there is a neighborhood $U_x$ of x and a neighborhood $V_x$ of y such
that

$$U_x \cap V_x = \emptyset.$$

The neighborhoods $\{U_x\}$ $(x \in K)$ form an open cover of K. Hence, by the
compactness of K, $\{U_x\}$ has a finite subcover consisting of sets $U_{x_1}, \ldots,
U_{x_n}$. Let

$$V = V_{x_1} \cap \cdots \cap V_{x_n}.$$

Then V is a neighborhood of the point y which does not intersect the set
$U_{x_1} \cup \cdots \cup U_{x_n} \supset K$, and hence $y \notin [K]$. It follows that K is closed
(in T). ∎


Remark. It is a consequence of Theorems 2 and 3 that compactness is 
an “intrinsic property,’ in the sense that a compactum remains a compactum 
after being ‘“embedded” in any larger Hausdorff space. 


THEOREM 4. Every compactum K is a normal space.


Proof. Let X and Y be any two disjoint closed subsets of K. Re-
peating the argument given in the proof of Theorem 3, we easily see that,
given any point $y \in Y$, there exists a neighborhood $U_y$ containing y and
an open set $O_y \supset X$ such that $U_y \cap O_y = \emptyset$. Since Y is compact, by
Theorem 2, the cover $\{U_y\}$ $(y \in Y)$ of the set Y has a finite subcover
$U_{y_1}, \ldots, U_{y_n}$. The open sets

$$O^{(1)} = O_{y_1} \cap \cdots \cap O_{y_n}, \qquad O^{(2)} = U_{y_1} \cup \cdots \cup U_{y_n}$$

then satisfy the normality conditions

$$O^{(1)} \supset X, \qquad O^{(2)} \supset Y, \qquad O^{(1)} \cap O^{(2)} = \emptyset.$$ ∎


10.2. Continuous mappings of compact spaces. Next we show that the 
“continuous image”’ of a compact space is itself a compact space: 


THEOREM 5. Let X be a compact space and f a continuous mapping of X
onto a topological space Y. Then Y = f(X) is itself compact.


Proof. Let $\{V_\alpha\}$ be any open cover of Y, and let $U_\alpha = f^{-1}(V_\alpha)$. Then
the sets $U_\alpha$ are open (being preimages of open sets under a continuous
mapping) and cover the space X. Since X is compact, $\{U_\alpha\}$ has a finite
subcover $U_{\alpha_1}, \ldots, U_{\alpha_n}$. Then the sets $V_{\alpha_1}, \ldots, V_{\alpha_n}$, where $V_{\alpha_k} = f(U_{\alpha_k})$,
cover Y. It follows that Y is compact. ∎


THEOREM 6. A one-to-one continuous mapping of a compactum X 
onto a compactum Y is necessarily a homeomorphism. 


Proof. We must show that the inverse mapping $f^{-1}$ is itself continuous.
Let F be a closed set in X and P = f(F) its image in Y. Then P is a
compactum, by Theorem 5. Hence, by Theorem 3, P is closed in Y.
Therefore the preimage under $f^{-1}$ of any closed set $F \subset X$ is closed. It
follows from Theorem 10', p. 88 that $f^{-1}$ is continuous. ∎


10.3. Countable compactness. We begin by proving an important property 
of compact spaces: 


THEOREM 7. If T is a compact space, then any infinite subset of T has
at least one limit point.

Proof. Suppose T contains an infinite set X with no limit point. Then
T contains a countable set

$$X_1 = \{x_1, x_2, \ldots, x_n, \ldots\} \subset X$$

with no limit point. But then the sets

$$X_n = \{x_n, x_{n+1}, \ldots\} \qquad (n = 1, 2, \ldots)$$

form a centered system of closed sets in T with an empty intersection,
i.e., T is not compact. ∎


These considerations suggest 


DEFINITION 3. A topological space T is said to be countably compact 
if every infinite subset of T has at least one limit point (in T). 


Thus Theorem 7 says that every compact set is countably compact. The 
converse, however, is not true (see Problem 1). The relation between the 
concepts of compactness and countable compactness is made clear by 


THEOREM 8. Each of the following two conditions is necessary and 
sufficient for a topological space T to be countably compact: 


1) Every countable open cover of T has a finite subcover; 
2) Every countable centered system of closed subsets of T has a non- 
empty intersection. 


Proof. The equivalence of conditions 1) and 2) is an immediate 
consequence of the duality principle. Moreover, if T is not countably 
compact, then, repeating the argument given in proving Theorem 7, 
we find that there is a countable centered system of closed subsets of T 
with an empty intersection. This proves the sufficiency of condition 2). 
Thus we need only prove the necessity of condition 2). Let T be
countably compact, and let $\{F_n\}$ be a countable centered system of
closed sets in T. Then, as we now show, $\bigcap_n F_n \neq \emptyset$. Let

$$\Phi_n = \bigcap_{k=1}^{n} F_k.$$

Then none of the $\Phi_n$ is empty, since $\{F_n\}$ is centered. Moreover,

$$\Phi_1 \supset \Phi_2 \supset \cdots \supset \Phi_n \supset \cdots,$$

and

$$\bigcap_n \Phi_n = \bigcap_n F_n.$$

There are now just two possibilities:


1) $\Phi_{n_0} = \Phi_{n_0+1} = \cdots$ starting from some index $n_0$, in which case it
is obvious that $\bigcap_n \Phi_n = \Phi_{n_0} \neq \emptyset$.

2) There are infinitely many distinct sets $\Phi_n$. In this case, there is
clearly no loss of generality in assuming that all the $\Phi_n$ are distinct.
Let $x_n \in \Phi_n - \Phi_{n+1}$. Then the sequence $\{x_n\}$ consists of infinitely
many distinct points of T, and hence, by the countable compact-
ness of T, must have at least one limit point, say $x_0$. But then $x_0$
must be a limit point of $\Phi_n$, since $\Phi_n$ contains all the points $x_n$,
$x_{n+1}, \ldots$. Moreover $x_0 \in \Phi_n$, since $\Phi_n$ is closed. It follows that

$$x_0 \in \bigcap_n \Phi_n, \quad \text{i.e.,} \quad \bigcap_n \Phi_n \neq \emptyset.$$ ∎


Thus compact topological spaces are those in which an arbitrary open 
cover has a finite subcover, while countably compact spaces are those in 
which every countable open cover has a finite subcover. Although in general 
countable compactness does not imply compactness, we have the following 
important special situation: 


THEOREM 9. The concepts of compactness and countable compactness 
coincide for a topological space T with a countable base. 


Proof. By Theorem 6, p. 83, every open cover $\mathcal{O}$ of T has a countable
subcover. Hence, if T is countably compact, $\mathcal{O}$ has a finite subcover, by
Theorem 8. ∎


Remark. The concept of a countably compact topological space, unlike 
that of a compact space, has not turned out to be very natural or fruitful. 
Its presence in mathematics can be explained in terms of a kind of “historical 
inertia.”” The point is that, as will be shown in the next section, the concepts 
of compactness and countable compactness coincide for metric spaces, as 
well as for spaces with a countable base. The notion of compactness was 
originally introduced in connection with metric spaces, with a compact metric 
space being defined as one in which every infinite subset has at least one 
limit point (i.e., in terms of what is now called “countable compactness’’). 
The “automatic transcription” of this definition from metric spaces to 
topological spaces then led to the concept of a countably compact topological 
space. Sometimes, especially in the older literature, the word ‘“‘compact”’ 
is used in the sense of “countably compact,”’ and a topological space compact 
in our sense (i.e., such that every open cover has a finite subcover) is said 
to be “‘bicompact.” In this older language, a compact Hausdorff space 
(a “‘compactum”’ in our terminology) is called a “bicompactum,” and the 
term ‘““compactum’’ is reserved for a compact metric space. We will adhere 
to the terminology introduced in Definitions 1 and 3, often using the term 
“metric compactum’’ to designate a compact metric space. 


10.4. Relatively compact subsets. Among the subsets of a topological
space, those whose closures are compact are of special interest:

DEFINITION 4. A subset M of a topological space T is said to be rela-
tively compact (in T) if its closure [M] in T is compact.


Example 1. According to Theorem 2, every subset of a compact topo- 
logical space is relatively compact. 


Example 2. As we will see in Sec. 11.3, every bounded subset of the real
line $R^1$ (or more generally of Euclidean n-space $R^n$) is relatively compact.


A related concept is given by 


DEFINITION 5. A subset M of a topological space T is said to be rela-
tively countably compact (in T) if every infinite subset $A \subset M$ has at least
one limit point in T (which may or may not belong to M).


Relative compactness (unlike compactness) is not an “intrinsic property,” 
i.e., it depends on the space T in which the given set M is “embedded.”
For example, the set of all rational numbers in the interval (0, 1) is relatively 
compact if regarded as a subset of the real line, but not if regarded as a subset 
of the space of all rational numbers. The concept of relative compactness 
is most important in the case of metric spaces (see Sec. 11.3). 


Problem 1. Let X be the set of all ordinal numbers less than the first 
uncountable ordinal. Let $(\alpha, \beta) \subset X$ denote the set of all ordinal numbers
$\gamma$ such that $\alpha < \gamma < \beta$, and let the open sets in X be all unions of intervals
$(\alpha, \beta)$. Prove that the resulting topological space is countably compact but
not compact. 


Problem 2. A topological space T is said to be locally compact if every 
point x € T has at least one relatively compact neighborhood. Show that a 
compact space is automatically locally compact, but not conversely. Prove 
that every closed subspace of a locally compact space is locally compact.


Problem 3. A point x is said to be a complete limit point of a subset A of a 
topological space if, given any neighborhood U of x, the sets A and $A \cap U$
have the same power (i.e., cardinal number). Prove that every infinite subset 
of a compact topological space has at least one complete limit point. 


Comment. Conversely, it can be shown that if every infinite subset of a 
topological space T has at least one complete limit point, then T is compact.¹¹


11. Compactness in Metric Spaces


11.1. Total boundedness. Since metric spaces are topological spaces of a
special kind, the definitions and results of the preceding section apply to
metric spaces as well. However, in the case of metric spaces, the concept
of compactness is intimately connected with another concept, known as
total boundedness.


¹¹ P. S. Alexandroff, op. cit., pp. 250-251; J. L. Kelley, op. cit., pp. 163-164.


DEFINITION 1. Let R be a metric space and $\varepsilon$ any positive number. Then
a set $A \subset R$ is said to be an $\varepsilon$-net for a set $M \subset R$ if, for every $x \in M$,
there is at least one point $a \in A$ such that $\rho(x, a) \leq \varepsilon$.


Example 1. The set of all points with integral coordinates is a $(1/\sqrt{2})$-net for the plane.
Example 2. Every subset of a totally bounded set is itself totally bounded. 


DEFINITION 2. Given a metric space R and a subset $M \subset R$, suppose M
has a finite $\varepsilon$-net for every $\varepsilon > 0$. Then M is said to be totally bounded.


If a set M is totally bounded, then obviously so is its closure [M]. Every 
totally bounded set is automatically bounded, being the union of a finite 
number of bounded sets (recall Problem 5, p. 65). The converse is not true, 
as shown in Example 4. 


Example 3. In Euclidean n-space $R^n$, total boundedness is equivalent to
boundedness. In fact, if $M \subset R^n$ is bounded, then M is contained in some
sufficiently large cube Q. Partitioning Q into smaller cubes of side $\varepsilon$, we find
that the vertices of the little cubes form a finite $(\sqrt{n}\,\varepsilon/2)$-net for Q and hence
(a fortiori) for any set contained in Q.
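
The net in Example 3 is easy to check numerically in the plane. The following sketch assumes the unit square as the cube Q, a side length that is an exact multiple of the mesh, and a finite sample of test points standing in for Q; the helper names `grid_vertices` and `is_eps_net` are illustrative.

```python
import itertools
import math

def grid_vertices(n, side, eps):
    """Vertices of the cubes of side eps that partition [0, side]^n
    (side is assumed to be a multiple of eps)."""
    steps = round(side / eps)
    axis = [k * eps for k in range(steps + 1)]
    return list(itertools.product(axis, repeat=n))

def is_eps_net(net, sample, eps):
    """Every sample point lies within eps of some point of the net
    (a tiny tolerance absorbs floating-point rounding)."""
    return all(min(math.dist(x, a) for a in net) <= eps + 1e-12 for x in sample)

n, side, eps = 2, 1.0, 0.25
net = grid_vertices(n, side, eps)

# Test points: a finer grid that includes the centers of the little squares.
fine_axis = [k * eps / 4 for k in range(round(4 * side / eps) + 1)]
sample = list(itertools.product(fine_axis, repeat=n))

radius = math.sqrt(n) * eps / 2
covered = is_eps_net(net, sample, radius)
covered_with_half_radius = is_eps_net(net, sample, radius / 2)
```

The centers of the little squares lie at distance exactly $\sqrt{n}\,\varepsilon/2$ from the nearest vertices, which is why the check succeeds for this radius and fails for half of it.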


Example 4. The unit sphere $\Sigma$ in $l_2$, with equation

$$\sum_{n=1}^{\infty} x_n^2 = 1,$$

is bounded but not totally bounded. In fact, consider the points

$$e_1 = (1, 0, 0, \ldots), \quad e_2 = (0, 1, 0, \ldots), \quad \ldots,$$

where the nth coordinate of $e_n$ is one and the others are all zero. These
points all lie on $\Sigma$, and the distance between any two of them is $\sqrt{2}$. Hence
$\Sigma$ cannot have a finite $\varepsilon$-net with $\varepsilon < \sqrt{2}/2$.


Example 5. Let $\Pi$ be the set of points $x = (x_1, x_2, \ldots, x_n, \ldots)$ in $l_2$
satisfying the inequalities

$$|x_1| \leq 1, \quad |x_2| \leq \frac{1}{2}, \quad \ldots, \quad |x_n| \leq \frac{1}{2^{n-1}}, \quad \ldots$$

The set $\Pi$, called the Hilbert cube (or fundamental parallelepiped),¹² furnishes
an example of an infinite-dimensional totally bounded set. The fact that $\Pi$
is totally bounded can be seen as follows: Given any $\varepsilon > 0$, choose n such
that

$$\frac{1}{2^{n-1}} < \frac{\varepsilon}{2},$$

and with each point

$$x = (x_1, x_2, \ldots, x_n, \ldots)$$

in $\Pi$ associate the point

$$x^* = (x_1, x_2, \ldots, x_n, 0, 0, \ldots) \qquad (1)$$

($x^*$ is also a point in $\Pi$). Then

$$\rho(x, x^*) = \sqrt{\sum_{k=n+1}^{\infty} x_k^2} \leq \sqrt{\sum_{k=n}^{\infty} \frac{1}{4^k}} < \frac{1}{2^{n-1}} < \frac{\varepsilon}{2}.$$

But the set $\Pi^*$ of all points in $\Pi$ of the form (1) is totally bounded, being
a bounded set in n-space. Let A be a finite $(\varepsilon/2)$-net in $\Pi^*$. Then A is a finite
$\varepsilon$-net for the whole set $\Pi$.


¹² Another commonly encountered definition of the Hilbert cube is the set of points
in $l_2$ satisfying the inequalities

$$|x_1| \leq 1, \quad |x_2| \leq \frac{1}{2}, \quad \ldots, \quad |x_n| \leq \frac{1}{n}, \quad \ldots$$


11.2. Compactness and total boundedness. We now show the connection 
between the concepts of compactness (of both kinds) and total boundedness: 


THEOREM 1. Every countably compact metric space R is totally bounded. 


Proof. Suppose R is not totally bounded. Then there is an $\varepsilon_0 > 0$
such that R has no finite $\varepsilon_0$-net. Choose any point $a_1 \in R$. Then R
contains at least one point, say $a_2$, such that

$$\rho(a_1, a_2) > \varepsilon_0,$$

since otherwise $a_1$ would be an $\varepsilon_0$-net for R. Moreover, R contains a
point $a_3$ such that

$$\rho(a_1, a_3) > \varepsilon_0, \quad \rho(a_2, a_3) > \varepsilon_0,$$

since otherwise the pair $a_1, a_2$ would be an $\varepsilon_0$-net for R. More generally,
once having found the points $a_1, a_2, \ldots, a_n$, we choose $a_{n+1} \in R$ such
that

$$\rho(a_k, a_{n+1}) > \varepsilon_0 \quad (k = 1, 2, \ldots, n).$$

This construction gives an infinite sequence of distinct points $a_1, a_2, \ldots,$
$a_n, \ldots$ with no limit points, since $\rho(a_j, a_k) > \varepsilon_0$ if $j \ne k$. But then R
cannot be countably compact. ∎
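The selection process used in this proof can be run on a finite sample. In the sketch below (an illustration only, with the hypothetical helper names `greedy_separated` and `is_eps_net`, and a grid in the unit square standing in for the space R), points are chosen one at a time, each farther than $\varepsilon$ from all points chosen earlier. On a totally bounded set the process must stop, and the points chosen up to then form an $\varepsilon$-net; the proof uses the contrapositive of this observation.

```python
import math

def greedy_separated(points, eps, dist):
    """Choose points one at a time, each farther than eps from all points chosen so far."""
    chosen = []
    for p in points:
        if all(dist(p, a) > eps for a in chosen):
            chosen.append(p)
    return chosen

def is_eps_net(net, points, eps, dist):
    """Check that every point lies within eps of some point of the net."""
    return all(any(dist(p, a) <= eps for a in net) for p in points)

# A 21 x 21 grid in the unit square plays the role of the space R.
grid = [(i / 20, j / 20) for i in range(21) for j in range(21)]
eps = 0.25
net = greedy_separated(grid, eps, math.dist)
print(len(net), is_eps_net(net, grid, eps, math.dist))
```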


COROLLARY 1. Every countably compact metric space has a countable
everywhere dense subset and a countable base.


Proof. Since R is totally bounded, by Theorem 1, R has a finite (1/n)-net
for every n = 1, 2, .... The union of all these nets is then a countable
everywhere dense subset of R. It follows from Theorem 5, p. 82 that R
has a countable base. ∎


COROLLARY 2. Every countably compact metric space is compact. 


Proof. An immediate consequence of Corollary 1 and Theorem 9,
p. 96. ∎


According to Theorem 1, total boundedness is a necessary condition for 
a metric space to be compact. However, this condition is not sufficient. For 
example, the set of rational points in the interval [0, 1] with the ordinary 
definition of distance forms a metric space R which is totally bounded but 
not compact. In fact, the sequence of points 


0, 0.4, 0.41, 0.414, 0.4142,... 


in R, i.e., the sequence of decimal approximations to the irrational number
$\sqrt{2} - 1$, has no limit point in R. Necessary and sufficient conditions for
compactness of a metric space are given by


THEOREM 2. A metric space R is compact if and only if it is totally 
bounded and complete. 


Proof. To see that compactness of R implies completeness of R,
we need only note that if R has a Cauchy sequence $\{x_n\}$ with no limit,
then $\{x_n\}$ has no limit points in R. This, together with Theorem 1,
shows that R is totally bounded and complete if R is compact.

Conversely, suppose R is totally bounded and complete, and let $\{x_n\}$
be any infinite sequence of distinct points in R. Let $N_1$ be a finite 1-net
for R, and construct a closed sphere of radius 1 about every point of $N_1$.
Since these spheres cover R and there are only finitely many of them, at least
one of the spheres, say $S_1$, contains an infinite subsequence

$$x_1^{(1)}, x_2^{(1)}, \ldots, x_n^{(1)}, \ldots$$

of the sequence $\{x_n\}$. Let $N_2$ be a finite $\tfrac{1}{2}$-net for R, and construct a closed
sphere of radius $\tfrac{1}{2}$ about every point of $N_2$. Then at least one of these
spheres, say $S_2$, contains an infinite subsequence

$$x_1^{(2)}, x_2^{(2)}, \ldots, x_n^{(2)}, \ldots$$

of the sequence $\{x_n^{(1)}\}$. Continue this construction indefinitely, finding
a closed sphere $S_3$ of radius $\tfrac{1}{4}$ containing an infinite subsequence

$$x_1^{(3)}, x_2^{(3)}, \ldots, x_n^{(3)}, \ldots$$

of the sequence $\{x_n^{(2)}\}$, and so on, where $S_n$ has radius $1/2^{n-1}$. Let $S_n'$ be
the closed sphere with the same center as $S_n$ but with a radius $r_n$ twice as
large (i.e., equal to $1/2^{n-2}$). Then clearly

$$S_1' \supset S_2' \supset \cdots \supset S_n' \supset \cdots,$$

and moreover $r_n \to 0$ as $n \to \infty$. Since R is complete, it follows from
the nested sphere theorem (Theorem 2, p. 60) that

$$\bigcap_{n=1}^{\infty} S_n' \ne \varnothing.$$

In fact, there is a point $x_0 \in R$ such that

$$\bigcap_{n=1}^{\infty} S_n' = \{x_0\}$$

(recall Problem 3, p. 65). Clearly $x_0$ is a limit point of the original
sequence $\{x_n\}$, since every neighborhood of $x_0$ contains some sphere $S_k$
and hence some infinite subsequence $\{x_n^{(k)}\}$. Therefore every infinite
sequence $\{x_n\}$ of distinct points of R has a limit point in R. It follows that
R is countably compact and hence compact, by Corollary 2. ∎


Example. As already noted, a subset M of Euclidean n-space $R^n$ is totally
bounded if and only if it is bounded. Moreover, M is complete if and only if
it is closed (recall Problem 7, p. 66). Hence, by Theorem 2, the set of all
compact subsets of $R^n$ coincides with the set of all closed bounded subsets
of $R^n$.


11.3. Relatively compact subsets of a metric space. The concept of relative 
compactness, introduced in Sec. 10.4 for subsets of an arbitrary topological 
space, applies in particular to subsets of a metric space. In the case of a 
metric space, however, there is no longer any distinction between relative 
compactness and relative countable compactness. 


THEOREM 3. A subset M of a complete metric space R is relatively 
compact if and only if it is totally bounded. 


Proof. An immediate consequence of Theorem 2 and the fact that a
closed subset of a complete metric space is itself complete. ∎


Example. Any bounded subset of Euclidean n-space is totally bounded
and hence relatively compact (this is our version of the familiar Bolzano-
Weierstrass theorem).


Remark. The utility of Theorem 3 stems from the fact that it is usually easier
to prove that a set is totally bounded than to give a direct proof of its relative 
compactness. On the other hand, compactness is the key property as far as 
applications are concerned. 


11.4. Arzela's theorem. The problem of proving the compactness of
various subsets of a given metric space is encountered quite frequently in
analysis. However, the direct application of Theorem 2 is not always easy.
This explains the need for special criteria serving as practical tools for proving
compactness in particular spaces. For example, as we have seen, the boundedness
of a set in Euclidean n-space implies its relative compactness, but this implication
fails in more general metric spaces.

One of the most important metric spaces in analysis is the function space
$C_{[a,b]}$, introduced in Example 6, p. 39. For subsets of this space, we have
an important and frequently used criterion for relative compactness, called
Arzela's theorem, which will be stated and proved after first introducing two
new concepts:


DEFINITION 3. A family $\Phi$ of functions $\varphi$ defined on a closed interval
[a, b] is said to be uniformly bounded if there exists a number K > 0 such
that

$$|\varphi(x)| < K$$

for all $x \in [a, b]$ and all $\varphi \in \Phi$.


DEFINITION 4. A family $\Phi$ of functions $\varphi$ defined on a closed interval
[a, b] is said to be equicontinuous if, given any $\varepsilon > 0$, there exists a number
$\delta > 0$ such that $|x' - x''| < \delta$ implies

$$|\varphi(x') - \varphi(x'')| < \varepsilon$$

for all $x', x'' \in [a, b]$ and all $\varphi \in \Phi$.
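To get a feeling for Definition 4, one can estimate numerically how much the members of a family change over a step of fixed length $\delta$. The sketch below is an illustration under stated assumptions: the helper `family_modulus` is hypothetical, it samples only pairs of points exactly $\delta$ apart, and it compares the shifts $\sin(x + c)$ with the first fifty functions $\sin nx$ on [0, 1]. Any finite family of continuous functions on a closed interval is equicontinuous, so the finite sample of $\sin nx$ only indicates how the estimate grows with n; for the full infinite family no single $\delta$ works.

```python
import math

def family_modulus(family, a, b, delta, samples=2000):
    """Approximate sup |phi(x) - phi(x + delta)| over x in [a, b - delta] and phi in family."""
    worst = 0.0
    for phi in family:
        for i in range(samples + 1):
            x = a + (b - a - delta) * i / samples
            worst = max(worst, abs(phi(x) - phi(x + delta)))
    return worst

# Shifts of the sine: every member has slope at most 1, so one delta serves them all.
shifts = [lambda x, c=c: math.sin(x + c) for c in range(50)]
# Increasingly rapid oscillations: the estimate grows with n for a fixed delta.
waves = [lambda x, n=n: math.sin(n * x) for n in range(1, 51)]

delta = 0.01
shift_mod = family_modulus(shifts, 0.0, 1.0, delta)
wave_mod = family_modulus(waves, 0.0, 1.0, delta)
print(shift_mod, wave_mod)
```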


THEOREM 4 (Arzela). A necessary and sufficient condition for a family
$\Phi$ of continuous functions $\varphi$ defined on a closed interval [a, b] to be
relatively compact in $C_{[a,b]}$ is that $\Phi$ be uniformly bounded and
equicontinuous.


Proof. We give the proof in two steps: 


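Step 1 (Necessity). Suppose $\Phi$ is relatively compact in $C_{[a,b]}$. Then
by Theorem 3, given any $\varepsilon > 0$, there is a finite $(\varepsilon/3)$-net $\varphi_1, \ldots, \varphi_k$
in $\Phi$ (see Problem 1). Being a continuous function defined on a closed
interval, each $\varphi_i$ is bounded:

$$|\varphi_i(x)| \le K_i \quad (a \le x \le b).$$

Let

$$K = \max\{K_1, \ldots, K_k\} + \frac{\varepsilon}{3}.$$

By the definition of an $(\varepsilon/3)$-net, given any $\varphi \in \Phi$, there is at least one $\varphi_i$
such that

$$\rho(\varphi, \varphi_i) = \max_{a \le x \le b} |\varphi(x) - \varphi_i(x)| < \frac{\varepsilon}{3}.$$

Therefore

$$|\varphi(x)| < |\varphi_i(x)| + \frac{\varepsilon}{3} \le K_i + \frac{\varepsilon}{3} \le K,$$

i.e., $\Phi$ is uniformly bounded. Moreover, each function $\varphi_i$ in the $(\varepsilon/3)$-net
is continuous, and hence uniformly continuous, on [a, b]. Hence, given
any $\varepsilon > 0$, there is a $\delta_i$ such that

$$|\varphi_i(x_1) - \varphi_i(x_2)| < \frac{\varepsilon}{3}$$

whenever $|x_1 - x_2| < \delta_i$. Let

$$\delta = \min\{\delta_1, \ldots, \delta_k\}.$$

Then, given any $\varphi \in \Phi$ and choosing $\varphi_i$ such that $\rho(\varphi, \varphi_i) < \varepsilon/3$, we have

$$|\varphi(x_1) - \varphi(x_2)| \le |\varphi(x_1) - \varphi_i(x_1)| + |\varphi_i(x_1) - \varphi_i(x_2)| + |\varphi_i(x_2) - \varphi(x_2)| < \frac{\varepsilon}{3} + \frac{\varepsilon}{3} + \frac{\varepsilon}{3} = \varepsilon$$

whenever $|x_1 - x_2| < \delta$. This proves the equicontinuity of $\Phi$.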


Step 2 (Sufficiency). Suppose $\Phi$ is uniformly bounded and equicontinuous.
According to Theorem 3, to prove that $\Phi$ is relatively compact
in $C_{[a,b]}$, we need only show that $\Phi$ is totally bounded, i.e., that
given any $\varepsilon > 0$, there exists a finite $\varepsilon$-net for $\Phi$ in $C_{[a,b]}$. Suppose
$|\varphi(x)| < K$ for all $\varphi \in \Phi$, and let $\delta > 0$ be such that

$$|\varphi(x_1) - \varphi(x_2)| < \frac{\varepsilon}{5}$$

for all $\varphi \in \Phi$ whenever $|x_1 - x_2| < \delta$. Divide the interval $a \le x \le b$
along the x-axis into subintervals of length less than $\delta$, by introducing
points of subdivision $x_0, x_1, x_2, \ldots, x_n$ such that

$$a = x_0 < x_1 < x_2 < \cdots < x_n = b,$$

and then draw a vertical line through each of these points. Similarly,
divide the interval $-K \le y \le K$ along the y-axis into subintervals of
length less than $\varepsilon/5$, by introducing points of subdivision $y_0, y_1, y_2, \ldots, y_p$
such that

$$-K = y_0 < y_1 < y_2 < \cdots < y_p = K,$$

and then draw a horizontal line through each of these points. In this
way, the rectangle $a \le x \le b$, $-K \le y \le K$ is divided into np cells of
horizontal side length less than $\delta$ and vertical side length less than $\varepsilon/5$.
We now associate with each function $\varphi \in \Phi$ a polygonal line $y = \psi(x)$
which has vertices at points of the form $(x_k, y_l)$ and differs from the
function $\varphi$ by less than $\varepsilon/5$ at every point $x_k$ (the reader should draw a
figure and convince himself of the existence of such a function). Since

$$|\varphi(x_k) - \psi(x_k)| < \frac{\varepsilon}{5},$$

$$|\varphi(x_{k+1}) - \psi(x_{k+1})| < \frac{\varepsilon}{5},$$

$$|\varphi(x_k) - \varphi(x_{k+1})| < \frac{\varepsilon}{5}$$

by construction, we have

$$|\psi(x_k) - \psi(x_{k+1})| < \frac{3\varepsilon}{5}.$$

Moreover,

$$|\psi(x_k) - \psi(x)| < \frac{3\varepsilon}{5} \quad (x_k \le x \le x_{k+1}),$$

since $\psi(x)$ is linear between the points $x_k$ and $x_{k+1}$. Let x be any point
in [a, b] and $x_k$ the point of subdivision nearest to x on the left. Then

$$|\varphi(x) - \psi(x)| \le |\varphi(x) - \varphi(x_k)| + |\varphi(x_k) - \psi(x_k)| + |\psi(x_k) - \psi(x)| < \varepsilon,$$

i.e., the set of polygonal lines $\psi(x)$ forms an $\varepsilon$-net for $\Phi$. But there
are obviously only finitely many such lines. Therefore $\Phi$ is totally
bounded. ∎


11.5. Peano’s theorem. Arzela’s theorem has many applications, among 
them the following existence theorem for differential equations: 


THEOREM 5 (Peano). Let f(x, y) be defined and continuous on a plane
domain G. Then at least one integral curve of the differential equation

$$\frac{dy}{dx} = f(x, y) \qquad (2)$$

passes through each point $(x_0, y_0)$ of G.
Proof. By the continuity of f, we have

$$|f(x, y)| < K$$

in some domain $G' \subset G$ containing the point $(x_0, y_0)$. Draw the lines
with slopes K and −K through the point $(x_0, y_0)$. Then draw vertical
lines x = a and x = b ($a < x_0 < b$) which together with the first two
lines form two isosceles triangles contained in $G'$ with common vertex
$(x_0, y_0)$, as shown in Figure 13. This gives a closed interval [a, b], which
will figure in the rest of the proof.

FIGURE 13

The next step is to construct a family of polygonal lines, called Euler
lines, associated with the differential equation (2). We begin by drawing
the line with slope $f(x_0, y_0)$ through the point $(x_0, y_0)$. Next, choosing a
point $(x_1, y_1)$ on the first line, we draw the line with slope $f(x_1, y_1)$ through
the point $(x_1, y_1)$. Then, choosing a point $(x_2, y_2)$ on the second line, we
draw the line with slope $f(x_2, y_2)$ through the point $(x_2, y_2)$, and so on
indefinitely. Suppose we construct a whole sequence $L_1, L_2, \ldots, L_n, \ldots$
of such Euler lines going through the point $(x_0, y_0)$, with the property
that the length of the longest line segment making up $L_n$ approaches 0
as $n \to \infty$. Let $\varphi_n$ be the function with graph $L_n$. Then this gives a family
of functions $\varphi_1, \varphi_2, \ldots, \varphi_n, \ldots$, all defined on the interval [a, b], which
is easily seen to be uniformly bounded and equicontinuous (why?). It
follows from Arzela's theorem that the sequence $\{\varphi_n\}$ contains a uniformly
convergent subsequence $\varphi^{(1)}, \varphi^{(2)}, \ldots, \varphi^{(n)}, \ldots$ Let

$$\varphi(x) = \lim_{n \to \infty} \varphi^{(n)}(x).$$

Then clearly

$$\varphi(x_0) = y_0,$$

so that the curve $y = \varphi(x)$ passes through the point $(x_0, y_0)$.

We now show that $y = \varphi(x)$ satisfies the differential equation (2) in
the open interval (a, b). This means showing that, given any $\varepsilon > 0$ and
any points $x', x'' \in (a, b)$, we have

$$\left| \frac{\varphi(x'') - \varphi(x')}{x'' - x'} - f(x', \varphi(x')) \right| < \varepsilon$$

whenever $|x'' - x'|$ is sufficiently small, or equivalently that

$$\left| \frac{\varphi^{(n)}(x'') - \varphi^{(n)}(x')}{x'' - x'} - f(x', \varphi(x')) \right| < \varepsilon \qquad (3)$$

whenever n is sufficiently large and $|x'' - x'|$ is sufficiently small. Let
$y' = \varphi(x')$. Then, by the continuity of f, given any $\varepsilon > 0$, there is a
number $\eta > 0$ such that

$$f(x', y') - \varepsilon < f(x, y) < f(x', y') + \varepsilon$$

whenever

$$|x - x'| < 2\eta, \quad |y - y'| < 4K\eta.$$

The set of points (x, y) satisfying these inequalities is a rectangle, which
we denote by Q. Let N be so large that for all n > N, the length of the
longest segment making up $L_n$ is less than $\eta$ and moreover

$$|\varphi(x) - \varphi^{(n)}(x)| < K\eta.$$

Then all the Euler lines $L_n$ with n > N lie inside the rectangle Q (why?).
Suppose $L_n$ has vertices $(a_0, b_0), (a_1, b_1), \ldots, (a_{k+1}, b_{k+1})$, where

$$a_0 \le x' < a_1 < a_2 < \cdots < a_k < x'' \le a_{k+1}.$$

Then

$$\varphi^{(n)}(a_1) - \varphi^{(n)}(x') = f(a_0, b_0)(a_1 - x'),$$
$$\varphi^{(n)}(a_{i+1}) - \varphi^{(n)}(a_i) = f(a_i, b_i)(a_{i+1} - a_i) \quad (i = 1, 2, \ldots, k - 1),$$
$$\varphi^{(n)}(x'') - \varphi^{(n)}(a_k) = f(a_k, b_k)(x'' - a_k).$$

Hence, if $|x'' - x'| < \eta$,

$$[f(x', y') - \varepsilon](a_1 - x') < \varphi^{(n)}(a_1) - \varphi^{(n)}(x') < [f(x', y') + \varepsilon](a_1 - x'),$$
$$[f(x', y') - \varepsilon](a_{i+1} - a_i) < \varphi^{(n)}(a_{i+1}) - \varphi^{(n)}(a_i) < [f(x', y') + \varepsilon](a_{i+1} - a_i) \quad (i = 1, 2, \ldots, k - 1),$$
$$[f(x', y') - \varepsilon](x'' - a_k) < \varphi^{(n)}(x'') - \varphi^{(n)}(a_k) < [f(x', y') + \varepsilon](x'' - a_k).$$

Adding these inequalities, we get

$$[f(x', y') - \varepsilon](x'' - x') < \varphi^{(n)}(x'') - \varphi^{(n)}(x') < [f(x', y') + \varepsilon](x'' - x')$$

if $|x'' - x'| < \eta$, which is equivalent to (3). ∎


Remark. Different subsequences of a sequence of Euler lines may converge
to different solutions of the differential equation (2). Hence the solution
$\varphi$ found in the proof of Theorem 5 may not be the unique solution of (2)
passing through the point $(x_0, y_0)$.
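The Euler lines used in the proof are easy to compute. The following sketch is only an illustration under simplifying assumptions: the vertices are equally spaced, only the part of the line to the right of $x_0$ is built, and the equation $y' = y$ with $y(0) = 1$ is chosen because its solution $e^x$ is known and unique, so that here the whole sequence of Euler lines converges and no passage to a subsequence is needed. The function name `euler_line` is hypothetical.

```python
import math

def euler_line(f, x0, y0, b, steps):
    """Vertices of the Euler polygonal line for y' = f(x, y) on [x0, b], equal steps."""
    h = (b - x0) / steps
    xs, ys = [x0], [y0]
    for _ in range(steps):
        # Follow the line with slope f at the current vertex up to the next vertex.
        ys.append(ys[-1] + h * f(xs[-1], ys[-1]))
        xs.append(xs[-1] + h)
    return xs, ys

f = lambda x, y: y  # right-hand side of y' = y
errors = []
for steps in (10, 100, 1000):
    xs, ys = euler_line(f, 0.0, 1.0, 1.0, steps)
    # Compare the last vertex with the exact value e of the solution at x = 1.
    errors.append(abs(ys[-1] - math.e))
print(errors)
```

As the longest segment shrinks, the error at x = 1 decreases, in line with the convergence asserted in the proof.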


Problem 1. Let M be a totally bounded subset of a metric space R. Prove
that the $\varepsilon$-nets figuring in the definition of total boundedness of M can always
be chosen to consist of points of M rather than of R.


13 To be explicit, we assume that $x'' > x'$. The case $x'' < x'$ is treated similarly.




Hint. Given an $\varepsilon$-net for M consisting of points $a_1, a_2, \ldots, a_n \in R$, all
within $\varepsilon$ of some point of M, replace each point $a_k$ by a point $b_k \in M$ such
that $\rho(a_k, b_k) \le \varepsilon$.


Problem 2. Prove that every totally bounded metric space is separable. 


Hint. Construct a finite (1/n)-net for every n = 1,2,... Then take the 
union of these nets. 


Problem 3. Let M be a bounded subset of the space $C_{[a,b]}$. Prove that the
set of all functions

$$F(x) = \int_a^x f(t)\,dt$$

with $f \in M$ is compact.


Problem 4. Given two metric compacta X and Y, let $C_{XY}$ be the set of
all continuous mappings of X into Y. Let distance be defined in $C_{XY}$ by the
formula

$$\rho(f, g) = \sup_{x \in X} \rho(f(x), g(x)). \qquad (4)$$

Prove that $C_{XY}$ is a metric space. Let $M_{XY}$ be the set of all mappings of
X into Y, with the same metric (4). Prove that $C_{XY}$ is closed in $M_{XY}$.


Hint. Use the method of Problem 1, p. 65 to prove that the limit of a 
uniformly convergent sequence of continuous mappings is itself a continuous 
mapping. 

Problem 5. Let X, Y and $C_{XY}$ be the same as in the preceding problem.
Prove the following generalization of Arzela's theorem: A necessary and
sufficient condition for a set $D \subset C_{XY}$ to be relatively compact is that
D be an equicontinuous family of functions, in the sense that given any $\varepsilon > 0$,
there exists a number $\delta > 0$ such that $\rho(x', x'') < \delta$ implies $\rho(f(x'), f(x'')) < \varepsilon$
for all $x', x'' \in X$ and all $f \in D$.


Hint. To prove the sufficiency, show that D is relatively compact in
$M_{XY}$ (defined in the preceding problem) and hence in $C_{XY}$, since $C_{XY}$ is
closed in $M_{XY}$. To prove the relative compactness of D in $M_{XY}$, first
represent X as a union of finitely many pairwise disjoint sets $E_i$ such that
$x', x'' \in E_i$ implies $\rho(x', x'') < \delta$. For example, let $x_1, \ldots, x_n$ be a $(\delta/2)$-net
for X, and let

$$E_i = S[x_i, \delta/2] - \bigcup_{j < i} S[x_j, \delta/2].$$

Then let $y_1, \ldots, y_m$ be an $\varepsilon$-net in Y, and let L be the set of all functions
taking the values $y_j$ on the sets $E_i$. Given any $f \in D$ and any $x_i \in \{x_1, \ldots, x_n\}$,
let $y_j \in \{y_1, \ldots, y_m\}$ be such that $\rho(f(x_i), y_j) < \varepsilon$ and let $g \in L$ be such that
$g(x_i) = y_j$. Show that $\rho(f(x), g(x)) < 2\varepsilon$, thereby proving that L is a finite
$2\varepsilon$-net for D in $M_{XY}$.


12. Real Functions on Metric and Topological Spaces


12.1. Continuous and uniformly continuous functions and functionals. Let T
be a topological space, in particular a metric space. Then by a real function
on T we mean a mapping of T into the space $R^1$ (the real line). For example,
a real function on Euclidean n-space $R^n$ is just the usual "function of n
variables." Suppose T is a function space, i.e., a space whose elements are
functions. Then a real function on T is called a functional.


Example 1. Let x(t) be a function defined on the interval [0, 1], let
$\varphi(s_0, s_1, \ldots, s_n)$ be a function of n + 1 variables defined for all real values
of its arguments, and let $\psi(t, u)$ be a function of two variables defined for
all $t \in [0, 1]$ and all real u. Then the following are all functionals:

$$F_1(x) = \sup_{0 \le t \le 1} x(t),$$

$$F_2(x) = \inf_{0 \le t \le 1} x(t),$$

$$F_3(x) = x(t_0) \quad \text{where } t_0 \in [0, 1],$$

$$F_4(x) = \varphi[x(t_0), x(t_1), \ldots, x(t_n)],$$

$$F_5(x) = \int_0^1 \psi[t, x(t)]\,dt,$$

$$F_6(x) = x'(t_0) \quad \text{where } t_0 \in [0, 1],$$

$$F_7(x) = \int_0^1 \sqrt{1 + x'^2(t)}\,dt,$$

$$F_8(x) = \int_0^1 |x'(t)|\,dt.$$

The functionals $F_1$, $F_2$, $F_3$, $F_4$ and $F_5$ are defined on the space C of all
functions continuous on the interval [0, 1]. On the other hand, $F_6$ is defined
only for functions differentiable at the point $t_0$, $F_7$ is defined only for functions
such that the expression $\sqrt{1 + x'^2(t)}$ is integrable, and $F_8$ is defined only for
functions with integrable $|x'(t)|$.


Example 2. The functional $F_1$ is continuous on C, since

$$\rho(x, y) = \sup |x - y|, \quad |\sup x - \sup y| \le \sup |x - y|.$$

Example 3. The functional $F_6$ is discontinuous on C at any point $x_0$ where
it is defined. In fact, let x(t) be such that $x'(t_0) = 1$ and $|x(t)| < \varepsilon$, and let
$y = x_0 + x$. Then $y'(t_0) = x_0'(t_0) + 1$ even though $\rho(x_0, y) < \varepsilon$. However,
$F_6$ is continuous if it is defined on the space $C^{(1)}$ of all functions continuously
differentiable on the interval [0, 1], with metric

$$\rho(x, y) = \sup_{0 \le t \le 1} \left[ |x(t) - y(t)| + |x'(t) - y'(t)| \right]$$

(why?).

Example 4. The functional $F_7$ is also discontinuous on C. In fact, let

$$x_0(t) \equiv 0, \quad x_n(t) = \frac{1}{n} \sin 2\pi n t.$$

Then

$$\rho(x_n, x_0) = \frac{1}{n} \to 0,$$

but $F_7(x_n) > 4$ for all n while $F_7(x_0) = 1$. Hence $F_7(x_n)$ fails to approach
$F_7(x_0)$ even though $x_n \to x_0$.
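The behavior described in Example 4 can be observed numerically. The sketch below is an illustration only: the helpers `sup_norm` and `arc_length` are hypothetical names, the supremum is approximated on a finite grid, and the integral defining $F_7$ is approximated by the midpoint rule using the explicit derivative $x_n'(t) = 2\pi \cos 2\pi n t$.

```python
import math

def sup_norm(x, samples=20000):
    """Approximate sup |x(t)| over [0, 1] on a uniform grid."""
    return max(abs(x(i / samples)) for i in range(samples + 1))

def arc_length(dx, samples=20000):
    """Midpoint-rule approximation of the integral of sqrt(1 + x'(t)^2) over [0, 1]."""
    h = 1.0 / samples
    return sum(math.sqrt(1.0 + dx((i + 0.5) * h) ** 2) for i in range(samples)) * h

results = []
for n in (1, 5, 25):
    x = lambda t, n=n: math.sin(2 * math.pi * n * t) / n
    dx = lambda t, n=n: 2 * math.pi * math.cos(2 * math.pi * n * t)
    # Store n, the distance from x_n to the zero function, and the value of F_7(x_n).
    results.append((n, sup_norm(x), arc_length(dx)))

for row in results:
    print(row)
```

The distance to $x_0$ shrinks like 1/n, while the computed length stays above 4 for every n, far from the value 1 of the functional at $x_0$.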


The ordinary concept of uniform continuity generalizes at once to the 
case of arbitrary metric spaces: 


DEFINITION 1. A real function f(x) defined on a metric space R is said
to be uniformly continuous on R if, given any $\varepsilon > 0$, there is a $\delta > 0$ such
that $\rho(x_1, x_2) < \delta$ implies $|f(x_1) - f(x_2)| < \varepsilon$ for all $x_1, x_2 \in R$.

The reader will recall from calculus that a real function continuous on a
closed interval [a, b] is uniformly continuous on [a, b]. This fact is a special
case of


THEOREM 1. A real function f continuous on a compact metric space R
is uniformly continuous on R.


Proof. Suppose f is continuous but not uniformly continuous on R.
Then for some positive $\varepsilon$ and every n there are points $x_n$ and $x_n'$ in R such
that

$$\rho(x_n, x_n') < \frac{1}{n} \qquad (1)$$

but

$$|f(x_n) - f(x_n')| \ge \varepsilon. \qquad (2)$$

Since R is compact, the sequence $\{x_n\}$ has a subsequence $\{x_{n_k}\}$ converging
to a point $x \in R$. Hence $\{x_{n_k}'\}$ also converges to x, because of (1). But
then at least one of the inequalities

$$|f(x) - f(x_{n_k})| \ge \frac{\varepsilon}{2}, \quad |f(x) - f(x_{n_k}')| \ge \frac{\varepsilon}{2}$$

must hold for arbitrary k, because of (2). This contradicts the assumed
continuity of f at x. ∎


12.2. Continuous and semicontinuous functions on compact spaces. As just
shown, the theorem on uniform continuity of a function continuous on a
closed interval generalizes to functions continuous on arbitrary metric
compacta. There are other properties of functions continuous on a closed
interval which generalize to arbitrary compact spaces (not necessarily metric
spaces):


THEOREM 2. A real function f continuous on a compact topological
space T is bounded on T.¹⁴ Moreover, f achieves its least upper bound and
greatest lower bound on T.


Proof. A continuous real function on T is a continuous mapping of
T into the real line $R^1$. The image of T in $R^1$ is compact, by Theorem 5,
p. 94. But every compact subset of $R^1$ is bounded and closed (see p.
101). Hence f is bounded on T. Moreover, f not only has a least upper
bound and greatest lower bound on T, but actually achieves these bounds
at points of T. ∎


Theorem 2 can be generalized to a larger class of functions, which we 
now introduce: 


DEFINITION 2. A (real) function f defined on a topological space T is
said to be upper semicontinuous at a point $x_0 \in T$ if, given any $\varepsilon > 0$, there
exists a neighborhood of $x_0$ in which $f(x) < f(x_0) + \varepsilon$. Similarly, f is said
to be lower semicontinuous at $x_0$ if, given any $\varepsilon > 0$, there exists a neighborhood
of $x_0$ in which $f(x) > f(x_0) - \varepsilon$.


Example 1. Let [x] be the integral part of x, i.e., the largest integer $\le x$.
Then f(x) = [x] is upper semicontinuous for all x.


Example 2. Given a continuous function f, suppose we increase the value
$f(x_0)$ taken by f at the point $x_0$. Then f becomes upper semicontinuous at $x_0$.
Similarly, f becomes lower semicontinuous at $x_0$ if we decrease $f(x_0)$.
Moreover, f is upper semicontinuous if and only if −f is lower semicontinuous.
These facts can be used to construct many more examples of
semicontinuous functions.


In studying the properties of semicontinuous functions, it is convenient
to allow them to take infinite values. If $f(x_0) = +\infty$, we regard f as upper
semicontinuous at $x_0$. The function f is also regarded as lower semicontinuous
at $x_0$ if, given any h > 0, there is a neighborhood of $x_0$ in which
f(x) > h. Similarly, if $f(x_0) = -\infty$, we regard f as lower semicontinuous
at $x_0$ and at the same time upper semicontinuous at $x_0$ if, given any h > 0,
there is a neighborhood of $x_0$ in which f(x) < −h.


We now prove the promised generalization of Theorem 2:


14 A real function (or functional) f is said to be bounded on a set E if f(E) is contained
in some interval [−C, C].


THEOREM 2'. A finite lower semicontinuous function f defined on a
compact topological space T is bounded from below.


Proof. Suppose to the contrary that $\inf f(x) = -\infty$. Then there
exists a sequence $\{x_n\}$ such that $f(x_n) < -n$. Since T is compact, the
infinite set $E = \{x_1, x_2, \ldots, x_n, \ldots\}$ has at least one limit point $x_0$.
Since f is finite and lower semicontinuous at $x_0$, there is a neighborhood
U of $x_0$ in which $f(x) > f(x_0) - 1$. But then U can only contain finitely
many points of E, so that $x_0$ cannot be a limit point of E. ∎


THEOREM 2''. A finite lower semicontinuous function f defined on a
compact topological space T achieves its greatest lower bound on T.


Proof. By Theorem 2', $\inf f(x)$ is finite. Clearly, there exists a
sequence $\{x_n\}$ such that

$$f(x_n) < \inf f(x) + \frac{1}{n}.$$

By the compactness of T, the set $E = \{x_1, x_2, \ldots, x_n, \ldots\}$ has at least
one limit point $x_0$. If $f(x_0) > \inf f$, then, by the semicontinuity of f at $x_0$,
there is a neighborhood U of the point $x_0$ and a $\delta > 0$ such that $f(x) >$
$\inf f + \delta$ for all $x \in U$. But then U cannot contain an infinite subset of
E, i.e., $x_0$ cannot be a limit point of E. It follows that $f(x_0) = \inf f$. ∎


Remark. Theorems 2' and 2'' remain true if the words "lower," "below,"
and "greatest" are replaced by "upper," "above," and "least." The details
are left as an exercise.


We conclude this section with some useful terminology: 


DEFINITION 3. Given a real function f defined on a metric space R, the
(finite or infinite) quantity

$$\bar{f}(x_0) = \lim_{\varepsilon \to 0} \Big\{ \sup_{x \in S(x_0, \varepsilon)} f(x) \Big\}$$

is called the upper limit of f at $x_0$, while the (finite or infinite) quantity

$$\underline{f}(x_0) = \lim_{\varepsilon \to 0} \Big\{ \inf_{x \in S(x_0, \varepsilon)} f(x) \Big\}$$

is called the lower limit of f at $x_0$. The difference

$$\omega f(x_0) = \bar{f}(x_0) - \underline{f}(x_0),$$

provided it exists,¹⁵ is called the oscillation of f at $x_0$.
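The quantities in Definition 3 can be approximated on the real line by taking the largest and smallest sampled values of f in shrinking spheres about $x_0$. The sketch below is an illustration only (the helper `upper_lower_limits` is a hypothetical name and uses a finite sample of each sphere). It compares the sign function, whose oscillation at 0 equals 2, with the continuous function $x^2$, whose oscillation at 0 is 0.

```python
def upper_lower_limits(f, x0, radii, samples=1001):
    """For each eps in radii, approximate (sup, inf) of f over the open sphere S(x0, eps)."""
    out = []
    for eps in radii:
        pts = [x0 - eps + 2 * eps * i / (samples - 1) for i in range(samples)]
        # Keep only sample points strictly inside the sphere, together with the center.
        inside = [p for p in pts if abs(p - x0) < eps] + [x0]
        vals = [f(p) for p in inside]
        out.append((max(vals), min(vals)))
    return out

sign = lambda x: (x > 0) - (x < 0)   # -1, 0 or 1
square = lambda x: x * x

radii = [10.0 ** -k for k in range(1, 6)]
sign_limits = upper_lower_limits(sign, 0.0, radii)
square_limits = upper_lower_limits(square, 0.0, radii)
print(sign_limits[-1], square_limits[-1])
```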


15 I.e., provided at least one of the numbers $\bar{f}(x_0)$, $\underline{f}(x_0)$ is finite.


FIGURE 14 (panels (a), (b), (c))


12.3. Continuous curves in metric spaces. Instead of mappings of a metric 
space into the real line, we now consider mappings of a subset of the real 
line into a metric space. More exactly, let P = f(t) be a continuous mapping
of the interval $a \le t \le b$ into a metric space R. As t "traverses" the
interval from a to b, the point P = f(t) "traverses a continuous curve" in
the space R. Before giving a formal definition corresponding to this rough
idea of a "curve," we make two key observations:


1) The order in which points are traversed will be regarded as an essential
   property of a curve. For example, the set of points shown in Figure
   14(a) gives rise to two distinct curves when traversed in the two distinct
   ways shown in Figures 14(b) and 14(c). Similarly, the function shown
   in Figure 15(a), defined in the interval $0 \le t \le 1$, determines a "curve"
   filling up the segment $0 \le y \le 1$ of the y-axis, but this curve is traversed
   three times (twice upward and once downward) and hence is distinct
   from the segment $0 \le y \le 1$ traversed just once from the point y = 0
   to the point y = 1.


2) The choice of the parameter t will be regarded as unimportant,
   provided a change in parameter does not change the order in which
   the points of the curve are traversed. Thus the functions shown in
   Figures 15(a) and 15(b) represent the same curve, even though a given
   point of the curve corresponds to different parameter values in the
   two cases. For example, the point A in Figure 15(a) corresponds to
   two isolated points C and D on the t-axis, while in Figure 15(b) the
   same point A corresponds to an isolated point C and a whole line
   segment DE (note that the point on the curve does not move at all
   as t traverses the segment DE).


FIGURE 15


We now give a formal definition of a curve, embodying these qualitative
ideas. Two continuous functions

$$P = f(t'), \quad P = g(t''),$$

defined on intervals

$$a' \le t' \le b', \quad a'' \le t'' \le b''$$

and taking values in a metric space R, are said to be equivalent if there exist
two continuous nondecreasing functions

$$t' = \varphi(t), \quad t'' = \psi(t),$$

defined on the same interval

$$a \le t \le b,$$

such that

$$\varphi(a) = a', \quad \varphi(b) = b',$$
$$\psi(a) = a'', \quad \psi(b) = b''$$

and

$$f(\varphi(t)) = g(\psi(t)) \quad \text{for all } t \in [a, b].$$

It is easy to see that this relation of equivalence is reflexive (f is equivalent
to f), symmetric (if f is equivalent to g, then g is equivalent to f) and transitive
(if f is equivalent to g and g is equivalent to h, then f is equivalent to h).
Hence the set of all continuous functions of the given type can be partitioned
into classes of equivalent functions (cf. Sec. 1.4), and each such class is said
to define a (continuous) curve in the space R.
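As a small check of this definition, the sketch below takes two parametrizations of the upper unit semicircle, one by the angle $t' \in [0, \pi]$ and one by $t'' \in [0, 1]$, and verifies on a grid that $f(\varphi(t)) = g(\psi(t))$ for the nondecreasing changes of parameter $\varphi(t) = \pi t$, $\psi(t) = t$ on [0, 1]. The particular curve and the function names are assumptions chosen for illustration.

```python
import math

def f(t1):
    """Upper unit semicircle, parametrized by the angle t' in [0, pi]."""
    return (math.cos(t1), math.sin(t1))

def g(t2):
    """The same semicircle, parametrized by t'' in [0, 1]."""
    return (math.cos(math.pi * t2), math.sin(math.pi * t2))

phi = lambda t: math.pi * t   # continuous, nondecreasing, maps [0, 1] onto [0, pi]
psi = lambda t: t             # continuous, nondecreasing, maps [0, 1] onto [0, 1]

# Largest distance between f(phi(t)) and g(psi(t)) over a grid of parameter values.
gap = max(math.dist(f(phi(i / 1000)), g(psi(i / 1000))) for i in range(1001))
print(gap)
```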

For each function $P = f(t')$ defined on an interval $[a', b']$, there is an
equivalent function defined on the interval $[a'', b''] = [0, 1]$. In fact, we need
only make the choice

$$t' = \varphi(t) = (b' - a')t + a', \quad t'' = \psi(t) = t.$$

Thus every curve can be regarded as specified parametrically in terms of a
function defined on the unit interval I = [0, 1]. By the same token, it is
often convenient¹⁶ to introduce the space C(I, R) of continuous mappings f
of the interval I into the space R, equipped with the metric

$$\hat{\rho}(f, g) = \sup_{0 \le t \le 1} \rho(f(t), g(t)), \qquad (3)$$

where $\rho$ is the metric in the space R.


16 Cf. Problems 7-12.


Problem 1. Let the functionals $F_1$, $F_2$, $F_3$, $F_4$, $F_5$ and the space C be the
same as on p. 108. Prove that


a) $F_1$, $F_2$ and $F_5$ are continuous on C;
b) $F_4$ is continuous on C if the function $\varphi$ is continuous in all its arguments;
c) $F_3$ is uniformly continuous on C.


Define $F_1$, $F_2$, $F_3$ and $F_4$ on a space larger than C.


Problem 2. Let the functionals $F_7$, $F_8$ and the spaces C, $C^{(1)}$ be the same
as on p. 108. Prove that


a) $F_8$ is discontinuous on C;
b) $F_7$ and $F_8$ are continuous on $C^{(1)}$.


Problem 3. Let M be the space of all bounded real functions defined on
the interval [a, b], with metric $\rho(f, g) = \sup |f - g|$. By the length of the
curve

$$y = f(x) \quad (a \le x \le b)$$

is meant the functional

$$L(f) = \sup \sum_{i=1}^{n} \sqrt{(x_i - x_{i-1})^2 + (f(x_i) - f(x_{i-1}))^2},$$

where the least upper bound (which may equal $+\infty$) is taken over all possible
partitions of [a, b] obtained by introducing points of subdivision $x_0, x_1,$
$x_2, \ldots, x_n$ such that

$$a = x_0 < x_1 < x_2 < \cdots < x_n = b.$$

Prove that

a) For continuous functions

$$L(f) = \lim_{\max |x_i - x_{i-1}| \to 0} \sum_{i=1}^{n} \sqrt{(x_i - x_{i-1})^2 + (f(x_i) - f(x_{i-1}))^2};$$

b) For continuously differentiable functions

$$L(f) = \int_a^b \sqrt{1 + f'^2(x)}\,dx;$$

c) The functional L(f) is lower semicontinuous on M.


Problem 4. Let $\bar{f}$, $\underline{f}$ and $\omega f$ be the same as in Definition 3. Prove that


a) $\bar{f}$ is upper semicontinuous;

b) $\underline{f}$ is lower semicontinuous;

c) f is continuous at $x_0$ if and only if $-\infty < \underline{f}(x_0) = \bar{f}(x_0) < \infty$, i.e., if
and only if $\omega f(x_0) = 0$.


Problem 5. Let K be a metric compactum and A a mapping of K into
itself such that $\rho(Ax, Ay) < \rho(x, y)$ if $x \ne y$. Prove that A has a unique
fixed point in K. Reconcile this with Problem 1, p. 76.


Problem 6. Let K be a metric compactum and $\{f_n(x)\}$ a sequence of
continuous functions on K, increasing in the sense that

$$f_1(x) \le f_2(x) \le \cdots \le f_n(x) \le \cdots$$

Prove that if $\{f_n(x)\}$ converges to a continuous function on K, then the
convergence is uniform (Dini's theorem).


Problem 7. A sequence of curves $\{\Gamma_n\}$ in a metric space R is said to
converge to a curve $\Gamma$ in R if the curves $\Gamma_n$ and $\Gamma$ have parametric
representations

$$P = f_n(t) \quad (0 \le t \le 1)$$

and

$$P = f(t) \quad (0 \le t \le 1),$$

respectively, such that

$$\lim_{n \to \infty} \hat{\rho}(f, f_n) = 0,$$

where $\hat{\rho}$ is the metric (3) of the space C(I, R) introduced on p. 113. Prove
that if a sequence of curves in a compact metric space R can be represented
parametrically by an equicontinuous family of functions on [0, 1], then the
sequence contains a convergent subsequence.


Hint. Use Problem 5, p. 107. 


Problem 8. Let $\Gamma$ be a curve in a metric space R, with parametric
representation

$$P = f(t) \quad (a \le t \le b).$$

By the length of $\Gamma$ is meant the functional

$$L(\Gamma) = L(f) = \sup \sum_{i=1}^{n} \rho(f(t_{i-1}), f(t_i)),$$

where $\rho$ is the metric in R and the least upper bound (which may equal $+\infty$)
is taken over all possible partitions of [a, b] obtained by introducing points
of subdivision $t_0, t_1, t_2, \ldots, t_n$ such that

$$a = t_0 < t_1 < t_2 < \cdots < t_n = b.$$

Prove that $L(\Gamma)$ is independent of the parametric representation of $\Gamma$.
Suppose we choose a = 0, b = 1, thereby confining ourselves to parametric
representations of the form

$$P = f(t) \quad (0 \le t \le 1).$$

Prove that L(f) is then a lower semicontinuous functional on the space
C(I, R) introduced on p. 113. Equivalently, prove that if a sequence of
curves $\{\Gamma_n\}$ converges to a curve $\Gamma$, in the sense of Problem 7, then $L(\Gamma)$
does not exceed the smallest limit point (i.e., the lower limit) of the sequence
$\{L(\Gamma_n)\}$.


Problem 9. Given a metric space R with metric $\rho$, let $\Gamma$ be a curve in R
of finite length S with parametric representation

$$P = f(t) \quad (a \le t \le b).$$

Let $s = \varphi(T)$ be the length of the arc

$$P = f(t) \quad (a \le t \le T)$$

(where $T \le b$), i.e., the arc of $\Gamma$ going from the "initial point" $P_a = f(a)$
to the "final point" $P_T = f(T)$. Then $\Gamma$ has a parametric representation
of the form

$$P = g(s) \quad (0 \le s \le S),$$

where $g(s) = f(\varphi^{-1}(s))$ if $\varphi$ is one-to-one. Prove that

$$\rho(g(s_1), g(s_2)) \le |s_1 - s_2|.$$

Hint. The length of an arc is no less than the length of the inscribed chord.

Problem 10. In the preceding problem, let $\tau = s/S$. Then $\Gamma$ has a
parametric representation

$$P = F(\tau) = g(S\tau) \quad (0 \le \tau \le 1)$$

in terms of a function F defined on the unit interval [0, 1]. Prove that
F satisfies a Lipschitz condition of the form

$$\rho(F(\tau_1), F(\tau_2)) \le S|\tau_1 - \tau_2|.$$

Suppose R is compact and let $\{\Gamma_n\}$ be a sequence of curves, all of length
less than some finite number M. Prove that $\{\Gamma_n\}$ contains a convergent
subsequence, where convergence of curves is defined as in Problem 7.


Problem 11. Given a compact metric space R, suppose two points A and B 
in R can be joined by a continuous curve of finite length. Prove that among 
all such curves, there is a curve of least length. 


Comment. Even in the case where R is a "smooth" (i.e., sufficiently
differentiable) closed surface in Euclidean 3-space, this result is not amenable
to the methods of elementary differential geometry, which ordinarily deals
only with the case of "neighboring" points A and B.


Problem 12. Let $\mathcal{C}$ be the set of all curves in a given metric space R.
Define the distance between two curves $\Gamma_1, \Gamma_2 \in \mathcal{C}$ by the formula

$$\tilde{\rho}(\Gamma_1, \Gamma_2) = \inf \hat{\rho}(f_1, f_2), \qquad (4)$$

where $\hat{\rho}$ is the metric (3) in the space C(I, R), and the greatest lower bound
is taken over all possible representations

$$P = f_1(t) \quad (0 \le t \le 1) \qquad (5)$$

of $\Gamma_1$ and

$$P = f_2(t) \quad (0 \le t \le 1) \qquad (6)$$

of $\Gamma_2$. Prove that the metric $\tilde{\rho}$ makes $\mathcal{C}$ into a metric space.


Comment. The fact that $\tilde{\rho}(\Gamma_1, \Gamma_2) = 0$ implies the identity of $\Gamma_1$ and $\Gamma_2$
follows from the (not very easily proved) fact that the greatest lower bound
in (4) is achieved for a suitable choice of the parametric representations (5)
and (6).


4 


LINEAR SPACES 


13. Basic Concepts 


13.1. Definitions and examples. One of the most important concepts in
mathematics is that of a linear space, which will play a key role in the rest
of this book:


DEFINITION 1. A nonempty set $L$ of elements $x, y, z, \ldots$ is said to be a
linear space (or vector space) if it satisfies the following three axioms:


1) Any two elements $x, y \in L$ uniquely determine a third element
$x + y \in L$, called the sum of $x$ and $y$, such that

a) $x + y = y + x$ (commutativity);

b) $(x + y) + z = x + (y + z)$ (associativity);

c) There exists an element $0 \in L$, called the zero element, with the
property that $x + 0 = x$ for every $x \in L$;

d) For every $x \in L$ there exists an element $-x$, called the negative
of $x$, with the property that $x + (-x) = 0$;


2) Any number $\alpha$ and any element $x \in L$ uniquely determine an element
$\alpha x \in L$, called the product of $\alpha$ and $x$, such that
a) $\alpha(\beta x) = (\alpha\beta)x$;
b) $1x = x$;


3) The operations of addition and multiplication obey two distributive
laws:
a) $(\alpha + \beta)x = \alpha x + \beta x$;
b) $\alpha(x + y) = \alpha x + \alpha y$.

Remark. The elements of $L$ are called “points” or “vectors,” while the
numbers $\alpha, \beta, \ldots$ are often called “scalars.” If $\alpha$ is an arbitrary real number,
$L$ is called a real linear space, while if $\alpha$ is an arbitrary complex number, $L$
is called a complex linear space.¹ Unless the contrary is explicitly stated, the
considerations that follow will be valid for both real and complex spaces.
Clearly, any complex linear space reduces to a real linear space if we allow
vectors to be multiplied by real numbers only.


We now give some examples of linear spaces, leaving it to the reader
to verify in detail that the conditions in Definition 1 are satisfied in each case.²


Example 1. The real line (the set of all real numbers) with the usual 
arithmetic operations of addition and multiplication is a linear space. 


Example 2. The set of all ordered n-tuples

$$x = (x_1, x_2, \ldots, x_n)$$

of real or complex numbers $x_1, x_2, \ldots, x_n$, with sums and “scalar multiples”
defined by the formulas

$$(x_1, x_2, \ldots, x_n) + (y_1, y_2, \ldots, y_n) = (x_1 + y_1, x_2 + y_2, \ldots, x_n + y_n),$$

$$\alpha(x_1, x_2, \ldots, x_n) = (\alpha x_1, \alpha x_2, \ldots, \alpha x_n),$$

is also a linear space. This space is called n-dimensional (vector) space, or
simply n-space, denoted by $R^n$ in the real case and $C^n$ in the complex case.
(Concerning the precise meaning of the term “n-dimensional,” see Sec.
13.2.)


Example 3. The set of all (real or complex) functions continuous on an
interval $[a, b]$, with the usual operations of addition of functions and
multiplication of functions by numbers, forms a linear space $C_{[a,b]}$, one of the
most important spaces in analysis.


Example 4. The set $l_2$ of all infinite sequences

$$x = (x_1, x_2, \ldots, x_k, \ldots) \tag{1}$$

of real or complex numbers $x_1, x_2, \ldots, x_k, \ldots$ satisfying the convergence
condition

$$\sum_{k=1}^{\infty} |x_k|^2 < \infty,$$


¹ More generally, one can consider linear spaces over an arbitrary field.

² It will be noted that certain symbols like $R^n$, $C_{[a,b]}$, $l_2$ and $m$ are used here with
somewhat different meanings than in Sec. 5.1. The point is that there is no metric here,
at least for the time being, while on the other hand, sums and scalar multiples of vectors
were not defined in Chaps. 2 and 3.

equipped with operations

$$(x_1, x_2, \ldots, x_k, \ldots) + (y_1, y_2, \ldots, y_k, \ldots) = (x_1 + y_1, x_2 + y_2, \ldots, x_k + y_k, \ldots),$$

$$\alpha(x_1, x_2, \ldots, x_k, \ldots) = (\alpha x_1, \alpha x_2, \ldots, \alpha x_k, \ldots), \tag{2}$$

is a linear space. The fact that

$$\sum_{k=1}^{\infty} |x_k|^2 < \infty, \qquad \sum_{k=1}^{\infty} |y_k|^2 < \infty$$

implies

$$\sum_{k=1}^{\infty} |x_k + y_k|^2 < \infty$$

is an immediate consequence of the elementary inequality

$$(x_k + y_k)^2 \le 2(x_k^2 + y_k^2).$$

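The elementary inequality behind Example 4 can be checked in a couple of lines. The derivation below is only a sketch: it is written for arbitrary complex numbers $a$ and $b$ (symbols introduced here for illustration), so it uses absolute values, which covers the real case as well.

```latex
\begin{align*}
|a + b|^2 &\le \bigl(|a| + |b|\bigr)^2 = |a|^2 + 2|a||b| + |b|^2 \\
          &\le 2\bigl(|a|^2 + |b|^2\bigr),
\end{align*}
% the last step uses 2|a||b| \le |a|^2 + |b|^2, i.e. (|a| - |b|)^2 \ge 0.
% Setting a = x_k, b = y_k and summing over k gives
\[
\sum_{k=1}^{\infty} |x_k + y_k|^2
  \le 2\sum_{k=1}^{\infty} |x_k|^2 + 2\sum_{k=1}^{\infty} |y_k|^2 < \infty .
\]
```

Thus the sum of two sequences satisfying the convergence condition satisfies it again.
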

Example 5. Let $c$ be the set of all convergent sequences (1), $c_0$ the set of
all sequences (1) converging to zero, $m$ the set of all bounded sequences,
and $R^\infty$ the set of all sequences (1). Then $c$, $c_0$, $m$ and $R^\infty$ are all linear spaces,
provided that in each case addition of sequences and multiplication of
sequences by numbers are defined by (2).


Since linear spaces are defined in terms of two operations, addition 
of elements and multiplication of elements by numbers, it is natural to 
introduce 


DEFINITION 2. Two linear spaces $L$ and $L^*$ are said to be isomorphic if
there is a one-to-one correspondence $x \leftrightarrow x^*$ between $L$ and $L^*$ which
preserves operations, in the sense that

$$x \leftrightarrow x^*, \qquad y \leftrightarrow y^*$$

(where $x, y \in L$, $x^*, y^* \in L^*$) implies

$$x + y \leftrightarrow x^* + y^*$$

and

$$\alpha x \leftrightarrow \alpha x^*$$

($\alpha$ an arbitrary number).


Remark. It is sometimes convenient to regard isomorphic linear spaces 
as different “realizations” of one and the same linear space. 


13.2. Linear dependence. We say that the elements $x, y, \ldots, w$ of a linear
space $L$ are linearly dependent if there exist numbers $\alpha, \beta, \ldots, \lambda$, not all zero,
such that³

$$\alpha x + \beta y + \cdots + \lambda w = 0. \tag{3}$$


³ The left-hand side of (3) is called a linear combination of the elements $x, y, \ldots, w$.

If no such numbers exist, the elements $x, y, \ldots, w$ are said to be linearly
independent. In other words, the elements $x, y, \ldots, w$ are linearly
independent if and only if (3) implies

$$\alpha = \beta = \cdots = \lambda = 0.$$


More generally, the elements $x, y, \ldots$ belonging to some infinite set $E \subset L$
are said to be linearly independent if the elements belonging to every finite
subset of $E$ are linearly independent.

A linear space L is said to be n-dimensional (or of dimension n) if n linearly 
independent elements can be found in L, but any n + 1 elements of L are 
linearly dependent. Suppose n linearly independent elements can be found 
in L for every n. Then L is said to be infinite-dimensional, but otherwise L 
is said to be finite-dimensional. Any set of n linearly independent elements of 
an n-dimensional space L is called a basis in L. 
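
As a small illustration of these definitions, consider real 2-space. The particular vectors below are chosen here for illustration and do not come from the text.

```latex
% Two linearly dependent vectors in R^2:
\[
x = (1, 2), \quad y = (3, 6): \qquad 3x + (-1)\,y = (3 - 3,\ 6 - 6) = 0,
\]
% so (3) holds with coefficients 3 and -1, which are not all zero.
%
% Two linearly independent vectors in R^2:
\[
e_1 = (1, 0), \quad e_2 = (0, 1): \qquad
\alpha e_1 + \beta e_2 = (\alpha, \beta) = 0
\ \Longrightarrow\ \alpha = \beta = 0 .
\]
```

The second pair is the natural candidate for a basis; showing that any three vectors of $R^2$ are linearly dependent is the remaining step needed to conclude that the space is two-dimensional.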


Remark. The typical course on linear algebra deals with finite-dimensional 
linear spaces. Here, however, we will be primarily concerned with infinite- 
dimensional spaces, the case of greater interest from the standpoint of 
mathematical analysis. 


13.3. Subspaces. Given a nonempty subset $L'$ of a linear space $L$, suppose
$L'$ is itself a linear space with respect to the operations of addition and
multiplication defined in $L$. Then $L'$ is said to be a subspace (of $L$). In other
words, we say that $L' \subset L$ is a subspace if $x \in L'$, $y \in L'$ implies $\alpha x + \beta y \in L'$
for arbitrary $\alpha$ and $\beta$. The “trivial space” consisting of the zero element alone
is a subspace of every linear space $L$. At the opposite extreme, $L$ can always
be regarded as a subspace of itself. By a proper subspace of a linear space $L$,
we mean a subspace which is distinct from $L$ itself and contains at least
one nonzero element.


Example 1. Let $L$ be any linear space, and $x$ any nonzero element of $L$.
Then the set $\{\lambda x\}$ of all scalar multiples of $x$, where $\lambda$ ranges over all (real or
complex) numbers, is obviously a one-dimensional subspace of $L$, in fact a
proper subspace if the dimension of $L$ exceeds 1.


Example 2. The set $P_{[a,b]}$ of all polynomials on $[a, b]$ is a proper subspace
of the set $C_{[a,b]}$ of all continuous functions on $[a, b]$. Like $C_{[a,b]}$ itself, $P_{[a,b]}$
is infinite-dimensional. At the same time, $C_{[a,b]}$ is itself a proper subspace of
the set of all functions on $[a, b]$, both continuous and discontinuous.


Example 3. Each of the linear spaces $l_2$, $c_0$, $c$, $m$ and $R^\infty$ (in that order)
is a proper subspace of the next one.


Given a linear space $L$, let $\{x_\alpha\}$ be any nonempty set of elements $x_\alpha \in L$.
Then $L$ has a smallest subspace (possibly $L$ itself) containing $\{x_\alpha\}$.⁴ In fact,


⁴ Here we use curly brackets in the same way as in footnote 10, p. 92.

there is at least one such subspace, namely $L$ itself. Moreover, it is clear
that the intersection of any system of subspaces $\{L_\gamma\}$ is itself a subspace,
since if $L^* = \bigcap_\gamma L_\gamma$ and $x, y \in L^*$, then $\alpha x + \beta y \in L^*$ for all $\alpha$ and $\beta$ (why?).
The smallest subspace of $L$ containing the set $\{x_\alpha\}$ is then just the intersection
of all subspaces containing $\{x_\alpha\}$. This minimal subspace, denoted by $L(\{x_\alpha\})$,
is called the (linear) subspace generated by $\{x_\alpha\}$ or the linear hull of $\{x_\alpha\}$.


13.4. Factor spaces. Let $L$ be a linear space and $L'$ a subspace of $L$.
Then two elements $x, y \in L$ are said to belong to the same (residue) class
generated by $L'$ if the difference $x - y$ belongs to $L'$. The set of all such
classes is called the factor space (or quotient space) of $L$ relative to $L'$, denoted
by $L/L'$. The operations of addition of elements and multiplication of elements
by numbers can be introduced in a factor space $L/L'$ in the following natural
way: Given two elements of $L/L'$, i.e., two classes $\xi$ and $\eta$, we choose a
“representative” from each class, say $x$ from $\xi$ and $y$ from $\eta$. We then
define the sum $\xi + \eta$ of the classes $\xi$ and $\eta$ to be the class containing the
element $x + y$, while the product $\alpha\xi$ of the number $\alpha$ and the class $\xi$ is
defined to be the class containing the element $\alpha x$. Here we rely on the fact
that the classes $\xi + \eta$ and $\alpha\xi$ are independent of the choice of the
“representatives” $x$ and $y$ (why?).
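
One way to see why the choice of representatives does not matter is sketched below. The primed representatives $x'$ and $y'$ are notation introduced here for illustration; the argument uses only the fact that $L'$ is a subspace.

```latex
% Let x, x' be two representatives of the class \xi and y, y' two
% representatives of the class \eta, so that x' - x \in L' and y' - y \in L'.
\begin{align*}
(x' + y') - (x + y) &= (x' - x) + (y' - y) \in L', \\
\alpha x' - \alpha x &= \alpha (x' - x) \in L'.
\end{align*}
% Hence x' + y' lies in the same class as x + y,
% and \alpha x' lies in the same class as \alpha x.
```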


THEOREM 1. Every factor space L/L’, with operations defined in the 
way just described, is a linear space. 


Proof. We need only verify that $L/L'$ satisfies the three axioms in
Definition 1. This is almost trivial (give the details). ∎


Let L be a linear space and L’ a subspace of L. Then the dimension of 
the factor space L/L’ is called the codimension of L’ in L. 


THEOREM 2. Let $L'$ be a subspace of a linear space $L$. Then $L'$ has finite
codimension $n$ if and only if there are linearly independent elements $x_1, \ldots, x_n$
in $L$ such that every element $x \in L$ has a unique representation of the
form

$$x = \alpha_1 x_1 + \cdots + \alpha_n x_n + y, \tag{4}$$

where $\alpha_1, \ldots, \alpha_n$ are numbers and $y \in L'$.

Proof. Suppose every element $x \in L$ has a unique representation of the
form (4). Given any class $\xi \in L/L'$, let $x$ be any element of $\xi$, and let
$\xi_k$ be the class containing $x_k$ $(k = 1, \ldots, n)$. Then (4) clearly implies

$$\xi = \alpha_1 \xi_1 + \cdots + \alpha_n \xi_n.$$

Hence $\xi_1, \ldots, \xi_n$ is a basis for $L/L'$ (the linear independence of $\xi_1, \ldots, \xi_n$
follows from that of $x_1, \ldots, x_n$). In other words, $L/L'$ has dimension
$n$, or equivalently $L'$ has codimension $n$.

Conversely, suppose $L'$ has codimension $n$, so that $L/L'$ has dimension
$n$. Then $L/L'$ has a basis $\xi_1, \ldots, \xi_n$. Given any $x \in L$, let $\xi$ be the class
in $L/L'$ containing $x$. Then

$$\xi = \alpha_1 \xi_1 + \cdots + \alpha_n \xi_n$$

for suitable numbers $\alpha_1, \ldots, \alpha_n$. But this means that every element in
$\xi$, in particular $x$, differs only by an element $y \in L'$ from a linear
combination of elements $x_1, \ldots, x_n$, where $x_k$ is any fixed element of
$\xi_k$ $(k = 1, \ldots, n)$, i.e.,

$$x = \alpha_1 x_1 + \cdots + \alpha_n x_n + y \qquad (y \in L') \tag{5}$$

(the linear independence of $x_1, \ldots, x_n$ follows from that of $\xi_1, \ldots, \xi_n$).
Suppose there is another such representation

$$x = \alpha_1' x_1 + \cdots + \alpha_n' x_n + y' \qquad (y' \in L'). \tag{5$'$}$$

Then, subtracting (5′) from (5), we get

$$0 = (\alpha_1 - \alpha_1') x_1 + \cdots + (\alpha_n - \alpha_n') x_n + y'' \qquad (y'' \in L'),$$

and hence

$$0 = (\alpha_1 - \alpha_1') \xi_1 + \cdots + (\alpha_n - \alpha_n') \xi_n,$$

where in the last equation 0 means the class containing the zero element
of $L$, i.e., the space $L'$ itself. But $\xi_1, \ldots, \xi_n$ are linearly independent,
and hence $\alpha_1 = \alpha_1', \ldots, \alpha_n = \alpha_n'$. ∎


13.5. Linear functionals. A numerical function $f$ defined on a linear space
$L$ is called a functional (on $L$).⁵ A functional $f$ is said to be additive if

$$f(x + y) = f(x) + f(y)$$

for all $x, y \in L$ and homogeneous if

$$f(\alpha x) = \alpha f(x)$$

for every number $\alpha$. A functional defined on a complex linear space is called
conjugate-homogeneous if

$$f(\alpha x) = \bar{\alpha} f(x)$$

for every number $\alpha$, where $\bar{\alpha}$ is the complex conjugate of $\alpha$. An additive


⁵ The word “functional” has already been used in a somewhat different sense in Sec.
12.1, where a functional means a real function defined on a function space (topological
or metric). Later on, we will deal with linear spaces which are also metric spaces and
have functions as their elements. The two uses of the word “functional” will then coincide
(if we allow complex-valued functionals).

homogeneous functional is called a linear functional, while an additive
conjugate-homogeneous functional is called a conjugate-linear functional.


Example 1. Let $R^n$ be real n-space, with elements $x = (x_1, \ldots, x_n)$, and
let $a = (a_1, \ldots, a_n)$ be a fixed element of $R^n$. Then

$$f(x) = \sum_{k=1}^{n} a_k x_k$$

is a linear functional on $R^n$. Similarly,

$$f(x) = \sum_{k=1}^{n} a_k \bar{x}_k$$

is a conjugate-linear functional on complex n-space $C^n$.


Example 2. Consider the integral

$$I(x) = \int_a^b x(t)\,dt,$$

or more generally

$$I(x) = \int_a^b x(t)\varphi(t)\,dt,$$

where $\varphi(t)$ is a fixed continuous function on $[a, b]$. It follows at once from
elementary properties of integrals that $I(x)$ is a linear functional. Similarly,
the integral

$$\bar{I}(x) = \int_a^b \overline{x(t)}\,dt,$$

or more generally

$$\bar{I}(x) = \int_a^b \overline{x(t)}\,\varphi(t)\,dt,$$

is a conjugate-linear functional on $C_{[a,b]}$.


Example 3. Another kind of linear functional on the space $C_{[a,b]}$ is the
functional

$$\delta_{t_0}(x) = x(t_0),$$

which assigns to each function $x(t) \in C_{[a,b]}$ its value at some fixed point
$t_0 \in [a, b]$. In mathematical physics, particularly in quantum mechanics, this
functional is often written in the form

$$\delta_{t_0}(x) = \int_a^b x(t)\,\delta(t - t_0)\,dt,$$

where $\delta(t)$ is a “fictitious” or “generalized” function, called the (Dirac)
delta function, which equals zero everywhere except at $t = 0$ and has an
integral equal to 1.⁶ As we will see in Sec. 20.3, the delta function can be


⁶ Clearly, no “true” function can have these properties!

represented as the limit, in a suitable sense, of a sequence of “true” functions
$\varphi_n$, each vanishing outside of some $\varepsilon_n$-neighborhood of the point $t = 0$ and
satisfying the condition

$$\int \varphi_n(t)\,dt = 1$$

($\varepsilon_n \to 0$ as $n \to \infty$).

Example 4. Let $n$ be a fixed positive integer, and let

$$x = (x_1, x_2, \ldots, x_k, \ldots)$$

be an arbitrary element of $l_2$. Then

$$f_n(x) = x_n$$

is obviously a linear functional on $l_2$. The same functional can be defined
on other spaces whose elements are sequences, e.g., on the spaces $c_0$, $c$, $m$
and $R^\infty$ considered in Example 5, p. 120.


13.6. The null space of a functional. Hyperplanes. Let $f$ be a linear
functional defined on a linear space $L$. Then the set $L_f$ of all elements $x \in L$ such
that

$$f(x) = 0$$

is called the null space of $f$. It will be assumed that $f$ is nontrivial, i.e., that
$f(x) \ne 0$ for at least one (and hence infinitely many) $x \in L$, so that the set
$L - L_f$ is nonempty. Obviously $L_f$ is a subspace of $L$, since $x, y \in L_f$ implies

$$f(\alpha x + \beta y) = \alpha f(x) + \beta f(y) = 0.$$

THEOREM 3. Let $x_0$ be any fixed element of $L - L_f$. Then every element
$x \in L$ has a unique representation of the form

$$x = \alpha x_0 + y,$$

where $y \in L_f$.


Proof. Clearly $f(x_0) \ne 0$, and in particular $x_0 \ne 0$. There is no loss
of generality in assuming that $f(x_0) = 1$, since otherwise we need only
replace $x_0$ by $x_0 / f(x_0)$, noting that

$$f\!\left(\frac{x_0}{f(x_0)}\right) = \frac{f(x_0)}{f(x_0)} = 1.$$

Given any $x \in L$, let

$$y = x - \alpha x_0,$$

where

$$\alpha = f(x).$$

Then $y \in L_f$, since

$$f(y) = f(x - \alpha x_0) = f(x) - \alpha f(x_0) = f(x) - \alpha = 0.$$

Thus

$$x = \alpha x_0 + y \qquad (y \in L_f). \tag{6}$$

Moreover, the representation (6) is unique. In fact, if there is another
such representation

$$x = \alpha' x_0 + y' \qquad (y' \in L_f), \tag{6$'$}$$

then, subtracting (6′) from (6), we get

$$(\alpha - \alpha') x_0 = y' - y.$$

If $\alpha = \alpha'$, then obviously $y' = y$. On the other hand, if $\alpha \ne \alpha'$, then

$$x_0 = \frac{y' - y}{\alpha - \alpha'} \in L_f,$$

contrary to the choice of $x_0$. ∎
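
A concrete instance of this decomposition may help. The space, functional and vectors below are chosen here purely for illustration and are not taken from the text.

```latex
% Take L = R^3, f(x) = x_1 + x_2 + x_3 and x_0 = (1, 0, 0), so that f(x_0) = 1.
% For x = (2, -1, 3):
\begin{align*}
\alpha &= f(x) = 2 - 1 + 3 = 4, \\
y &= x - \alpha x_0 = (2, -1, 3) - (4, 0, 0) = (-2, -1, 3), \\
f(y) &= -2 - 1 + 3 = 0 .
\end{align*}
% Hence x = 4 x_0 + y, where y lies in the null space of f,
% i.e. in the plane x_1 + x_2 + x_3 = 0.
```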


COROLLARY 1. Two elements $x_1$ and $x_2$ belong to the same class
generated by $L_f$ if and only if $f(x_1) = f(x_2)$.

Proof. It follows from

$$x_1 = f(x_1) x_0 + y_1,$$
$$x_2 = f(x_2) x_0 + y_2$$

that

$$x_1 - x_2 = (f(x_1) - f(x_2)) x_0 + (y_1 - y_2).$$

Hence $x_1 - x_2 \in L_f$ if and only if the coefficient of $x_0$ vanishes. ∎


COROLLARY 2. $L_f$ has codimension 1.


Proof. Given any class $\xi$ generated by $L_f$, let $x$ be any element of $\xi$
and choose $f(x) x_0 = \alpha x_0$ as the “representative” of $\xi$. By Corollary 1,
this representative is unique, and there is obviously a nonzero class
since $x_0 \ne 0$ and $f(x) \ne 0$ for some $x \in L$. Moreover, given any two
distinct classes $\xi$ and $\eta$ with representatives $\alpha x_0$ and $\beta x_0$, respectively,
we have

$$\beta(\alpha x_0) - \alpha(\beta x_0) = 0$$

and hence

$$\beta \xi - \alpha \eta = 0,$$

where at least one of the numbers $\alpha$, $\beta$ is nonzero (why?). Therefore any
two distinct elements of $L/L_f$ are linearly dependent. It follows that
$L/L_f$ is one-dimensional, i.e., $L_f$ has codimension 1. ∎


COROLLARY 3. Two nontrivial linear functionals $f$ and $g$ with the same
null space are proportional.

Proof. Again let $x_0$ be such that $f(x_0) = 1$. Then $g(x_0) \ne 0$. In fact,

$$x = f(x) x_0 + y \qquad (y \in L_f),$$

and hence

$$g(x) = f(x) g(x_0) + g(y) = f(x) g(x_0),$$

since $L_f = L_g$. But then $g(x_0) = 0$ would imply that $g$ is trivial, contrary
to hypothesis. It follows that

$$g(x) = g(x_0) f(x),$$

i.e., $g(x)$ is proportional to $f(x)$ with constant of proportionality $g(x_0)$. ∎


Given a linear space $L$, let $L' \subset L$ be any subspace of codimension 1.
Then every class in $L$ generated by $L'$ is called a hyperplane “parallel to $L'$”
(in particular, $L'$ itself is a hyperplane containing 0, i.e., “going through the
origin”). In other words, a hyperplane $M'$ parallel to a subspace $L'$ is the
set obtained by subjecting $L'$ to the parallel displacement (or shift) determined
by the vector $x_0 \in L$, so that⁷

$$M' = L' + x_0 = \{x : x = x_0 + y,\ y \in L'\}.$$

It is clear that $M' = L'$ if and only if $x_0 \in L'$. We can now give a simple
geometric interpretation of linear functionals:


THEOREM 4. Given a linear space $L$, let $f$ be a nontrivial linear functional
on $L$. Then the set $M_f = \{x : f(x) = 1\}$ is a hyperplane parallel to the null
space $L_f$ of the functional. Conversely, let $M' = L' + x_0$ $(x_0 \notin L')$ be any
hyperplane parallel to a subspace $L' \subset L$ of codimension 1 and not passing
through the origin. Then there exists a unique linear functional $f$ on $L$ such
that $M' = \{x : f(x) = 1\}$.


Proof. Given $f$, let $x_0$ be such that $f(x_0) = 1$ (such an $x_0$ can always
be found). Then, by Theorem 3, every vector $x \in M_f$ can be represented
in the form $x = x_0 + y$, where $y \in L_f$.

Conversely, given $M' = L' + x_0$ $(x_0 \notin L')$, it follows from Theorem 2
and its proof that every element $x \in L$ can be uniquely represented in the
form $x = \alpha x_0 + y$, where $y \in L'$. Setting $f(x) = \alpha$, we get the desired
linear functional. The uniqueness of $f$ follows from the fact that if
$g(x) = 1$ for $x \in M'$, then $g(y) = 0$ for $y \in L'$ (why?), so that

$$g(\alpha x_0 + y) = \alpha = f(\alpha x_0 + y). \ ∎$$


Remark. Thus we have established a one-to-one correspondence between
the set of all nontrivial linear functionals on $L$ and the set of all
hyperplanes in $L$ which do not pass through the origin.
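
The correspondence can be made concrete in the plane. The functional and the parametrization below are assumptions made here for illustration only.

```latex
% Take L = R^2 and f(x) = x_1 + 2 x_2. Its null space is the line through the origin
\[
L_f = \{(-2s,\ s) : s \in R\},
\]
% and x_0 = (1, 0) satisfies f(x_0) = 1. The hyperplane M_f = L_f + x_0 is then
\[
M_f = \{(1 - 2s,\ s) : s \in R\}, \qquad f(1 - 2s,\ s) = (1 - 2s) + 2s = 1,
\]
% i.e. the line x_1 + 2 x_2 = 1, parallel to L_f and not passing through the origin.
```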


⁷ The expression on the right is shorthand for the set of all $x$ such that $x = x_0 + y$,
$y \in L'$ (the colon is read “such that”). Similarly, $\{x : f(x) = 1\}$ is the set of all $x$ such that
$f(x) = 1$, and so on.

Problem 1. Prove that the set of all polynomials of degree at most $n - 1$ with
real (or complex) coefficients is a linear space, isomorphic to the n-dimensional
vector space $R^n$ (or $C^n$).


Problem 2. Verify that $R^n$ and $C^n$ are n-dimensional, as anticipated by the
terminology in Example 2, p. 119.


Problem 3. Verify that the spaces $C_{[a,b]}$, $l_2$, $c$, $c_0$, $m$ and $R^\infty$ are all
infinite-dimensional.


Problem 4. Given a linear space $L$, a set $\{x_\alpha\}$ of linearly independent
elements of $L$ is said to be a Hamel basis (in $L$) if the linear subspace generated
by $\{x_\alpha\}$ coincides with $L$. Prove that
a) Every linear space has a Hamel basis;
b) If $\{x_\alpha\}$ is a Hamel basis in $L$, then every vector $x \in L$ has a unique
representation as a finite linear combination of vectors from the set
$\{x_\alpha\}$;
c) Any two Hamel bases in a linear space $L$ have the same power
(cardinal number), called the algebraic dimension of $L$;
d) Two linear spaces are isomorphic if and only if they have the same
algebraic dimension.


Problem 5. Let L’ be a k-dimensional subspace of an n-dimensional linear 
space L. Prove that the factor space L/L’ has dimension n — k. 


Problem 6. Let $f, f_1, \ldots, f_n$ be linear functionals on a linear space $L$ such
that $f_1(x) = \cdots = f_n(x) = 0$ implies $f(x) = 0$. Prove that there exist
constants $a_1, \ldots, a_n$ such that

$$f(x) = \sum_{k=1}^{n} a_k f_k(x)$$

for every $x \in L$.


14. Convex Sets and Functionals. The Hahn-Banach Theorem 


14.1. Convex sets and bodies. Many important topics in the theory of
linear spaces rely on the notion of convexity. This notion, stemming from
intuitive geometric ideas, can be formulated purely analytically. Given a
real linear space $L$, let $x$ and $y$ be any two points of $L$. Then by the (closed)
segment in $L$ joining $x$ and $y$ we mean the set of all points in $L$ of the form
$\alpha x + \beta y$ where $\alpha, \beta \ge 0$ and $\alpha + \beta = 1$. Such a segment minus its end
points $x$ and $y$ is called an open segment. By the interior of a set $M \subset L$,
denoted by $I(M)$, we mean the set of all points $x \in M$ with the following
property: Given any $y \in L$, there exists a number $\varepsilon = \varepsilon(y) > 0$ such that
$x + ty \in M$ if $|t| < \varepsilon$.


DEFINITION 1. A set $M \subset L$ is said to be convex if whenever it contains
two points $x$ and $y$, it also contains the segment joining $x$ and $y$.


DEFINITION 2. A convex set is called a convex body if its interior is
nonempty.


Example 1. The cube, ball, tetrahedron and half-space are all convex
bodies in three-dimensional Euclidean space $R^3$. On the other hand, the
line segment, plane and triangle are convex sets in $R^3$, but not convex bodies.

Example 2. As usual, let $C_{[a,b]}$ be the space of all functions continuous on
the interval $[a, b]$, and let $M$ be the subset of $C_{[a,b]}$ consisting of all functions
satisfying the extra condition

$$|f(t)| \le 1.$$

Then $M$ is convex, since

$$|f(t)| \le 1, \qquad |g(t)| \le 1$$

together with $\alpha, \beta \ge 0$, $\alpha + \beta = 1$ implies

$$|\alpha f(t) + \beta g(t)| \le \alpha + \beta = 1.$$


Example 3. The closed unit sphere in $l_2$, i.e., the set of all points
$x = (x_1, x_2, \ldots, x_n, \ldots)$ such that

$$\sum_{n=1}^{\infty} x_n^2 \le 1,$$

is a convex body. Its interior consists of all points $x = (x_1, x_2, \ldots, x_n, \ldots)$
satisfying the condition

$$\sum_{n=1}^{\infty} x_n^2 < 1.$$

Example 4. The Hilbert cube $\Pi$ (see Example 5, p. 98) is a convex set in
$l_2$, but not a convex body. In fact,

$$|x_n| \le \frac{1}{2^{n-1}} \qquad (n = 1, 2, \ldots)$$

if $x \in \Pi$. Let

$$y_0 = \left(1, \frac{1}{2}, \ldots, \frac{1}{n}, \ldots\right),$$

and suppose $x + t y_0 \in \Pi$, i.e.,

$$\left| x_n + \frac{t}{n} \right| \le \frac{1}{2^{n-1}}.$$

Then

$$\left| \frac{t}{n} \right| \le \left| x_n + \frac{t}{n} \right| + |x_n| \le \frac{1}{2^{n-1}} + \frac{1}{2^{n-1}} = \frac{1}{2^{n-2}}$$

for all $n = 1, 2, \ldots$, which implies $t = 0$. Therefore the interior of $\Pi$ is
empty.

THEOREM 1. If $M$ is a convex set, then so is its interior $I(M)$.


Proof. Suppose $x, y \in I(M)$, and let $z = \alpha x + \beta y$, $\alpha, \beta \ge 0$,
$\alpha + \beta = 1$. Then, given any $a \in L$, there are numbers $\varepsilon_1 > 0$, $\varepsilon_2 > 0$ such
that the points $x + t_1 a$, $y + t_2 a$ belong to $M$ if $|t_1| < \varepsilon_1$, $|t_2| < \varepsilon_2$.
Therefore

$$\alpha(x + ta) + \beta(y + ta) = z + ta$$

belongs to $M$ if $|t| < \varepsilon = \min\{\varepsilon_1, \varepsilon_2\}$, i.e., $z \in I(M)$. ∎


THEOREM 2. The intersection

$$M = \bigcap_{\alpha} M_\alpha$$

of any number of convex sets $M_\alpha$ is itself a convex set.


Proof. Let $x$ and $y$ be any two points of $M$. Then $x$ and $y$ belong to
every $M_\alpha$, and hence so does the segment joining $x$ and $y$. But then the
segment joining $x$ and $y$ belongs to $M$. ∎


Given any subset A of a linear space L, there is a smallest convex set
containing A, i.e., the intersection of all convex sets containing A (there
is at least one convex set containing A, namely L itself). This minimal
convex set containing A is called the convex hull of A. For example, the
convex hull of three noncollinear points is the triangle with these points as
vertices.


14.2. Convex functionals. Next we introduce the important concept of a 
convex functional: 


DEFINITION 3. A functional $p$ defined on a real linear space $L$ is said to
be convex if


1) $p(x) \ge 0$ for all $x \in L$ (nonnegativity);
2) $p(\alpha x) = \alpha p(x)$ for all $x \in L$ and all $\alpha \ge 0$;
3) $p(x + y) \le p(x) + p(y)$ for all $x, y \in L$.


Remark. Here, unlike the case of linear functionals, we do not assume
that $p(x)$ is finite for all $x \in L$, i.e., we allow the case where $p(x) = +\infty$
for some $x \in L$.


Example 1. The length of a vector in Euclidean n-space $R^n$ is a convex
functional. The first and second conditions are immediate consequences of
the definition of length in $R^n$ (length is inherently nonnegative), while the
third condition means that the length of the sum of two vectors does not
exceed the sum of their lengths (the triangle inequality).

Example 2. Let $M$ be the space of bounded functions $x(s)$ defined on some
set $S$, and let $s_0$ be a fixed point of $S$. Then

$$p_{s_0}(x) = |x(s_0)|$$

is a convex functional.

Example 3. Let $m$ be the space of bounded numerical sequences
$x = (x_1, x_2, \ldots, x_k, \ldots)$. Then the functional

$$p(x) = \sup_k |x_k|$$

is convex.


14.3. The Minkowski functional. Next we consider the connection be- 
tween convex functionals and convex sets: 


THEOREM 3. If $p$ is a convex functional on a linear space $L$ and $k$ is any
positive number, then the set

$$E = \{x : p(x) \le k\}$$

is convex. If $p$ is finite, then $E$ is a convex body with interior

$$I(E) = \{x : p(x) < k\}$$

(so that in particular $0 \in I(E)$).

Proof. If $x, y \in E$, $\alpha, \beta \ge 0$, $\alpha + \beta = 1$, then

$$p(\alpha x + \beta y) \le \alpha p(x) + \beta p(y) \le k,$$

i.e., $E$ is a convex set. Now suppose $p$ is finite, and let $p(x) < k$, $t > 0$,
$y \in L$. Then

$$p(x \pm ty) \le p(x) + t\,p(\pm y).$$

If $p(-y) = p(y) = 0$, then $x \pm ty \in E$ for all $t$. On the other hand, if at
least one of the numbers $p(y)$, $p(-y)$ is nonzero, then $x \pm ty \in E$ if

$$t < \frac{k - p(x)}{\max\{p(y), p(-y)\}}. \ ∎$$

Suppose we choose a definite value of $k$, say $k = 1$. Then every finite
convex functional $p$ uniquely determines a convex body $E$ in $L$, such that
$0 \in I(E)$. Conversely, suppose $E$ is a convex body whose interior contains
the point 0, and consider the functional

$$p_E(x) = \inf \left\{ r : \frac{x}{r} \in E,\ r > 0 \right\}, \tag{1}$$

called the Minkowski functional of the convex body $E$. Then we have


THEOREM 4. The Minkowski functional (1) is finite and convex.

Proof. Given any $x \in L$, the element $x/r$ belongs to $E$ if $r$ is
sufficiently large (why?), and hence $p_E(x)$ is nonnegative and finite. Clearly
$p_E(0) = 0$. If $\alpha > 0$, then

$$p_E(\alpha x) = \inf \left\{ r > 0 : \frac{\alpha x}{r} \in E \right\} = \inf \left\{ \alpha r' > 0 : \frac{x}{r'} \in E \right\} = \alpha \inf \left\{ r' > 0 : \frac{x}{r'} \in E \right\} = \alpha\, p_E(x). \tag{2}$$

Next, given any $\varepsilon > 0$ and any $x_1, x_2 \in L$, choose numbers $r_i$ $(i = 1, 2)$
such that

$$p_E(x_i) < r_i < p_E(x_i) + \varepsilon.$$

Then $x_i / r_i \in E$. If $r = r_1 + r_2$, then

$$\frac{x_1 + x_2}{r} = \frac{r_1 x_1}{r r_1} + \frac{r_2 x_2}{r r_2}$$

belongs to the segment with end points $x_1/r_1$ and $x_2/r_2$. Since $E$ is convex,
this segment and hence the point $(x_1 + x_2)/r$ belongs to $E$. It follows that

$$p_E(x_1 + x_2) \le r = r_1 + r_2 < p_E(x_1) + p_E(x_2) + 2\varepsilon$$

or

$$p_E(x_1 + x_2) \le p_E(x_1) + p_E(x_2), \tag{3}$$

since $\varepsilon$ is arbitrary. Together (2) and (3) imply that $p_E(x)$ is convex. ∎
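
A one-dimensional example shows how the shape of $E$ is reflected in its Minkowski functional. The interval below is chosen here for illustration and is not taken from the text.

```latex
% Take L = R^1 and E = [-1, 2], a convex body whose interior contains 0.
% For x > 0:  x/r \in E  iff  x/r \le 2  iff  r \ge x/2.
% For x < 0:  x/r \in E  iff  x/r \ge -1 iff  r \ge -x.
\[
p_E(x) = \inf\{ r > 0 : x/r \in [-1, 2] \} =
\begin{cases}
x/2, & x \ge 0, \\
-x,  & x < 0,
\end{cases}
\qquad\text{i.e.}\quad p_E(x) = \max\{x/2,\ -x\}.
\]
% p_E is nonnegative, positively homogeneous and subadditive, but
% p_E(-x) \ne p_E(x) in general, since E is not symmetric about 0.
```

Note also that $\{x : p_E(x) \le 1\} = [-1, 2] = E$, in agreement with the correspondence between convex bodies and finite convex functionals described above.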


14.4. The Hahn-Banach theorem. Given a real linear space $L$ and any
subspace $L_0 \subset L$, let $f_0$ be a linear functional defined on $L_0$. Then a linear
functional $f$ defined on the whole space $L$ is said to be an extension of the
functional $f_0$ if

$$f(x) = f_0(x) \qquad \text{for all } x \in L_0.$$

A problem frequently encountered in analysis is that of extending an arbitrary
linear functional, originally defined on some subspace, onto a larger space.
A central role in problems of this kind is played by


THEOREM 5 (Hahn-Banach). Let $p$ be a finite convex functional defined
on a real linear space $L$, and let $L_0$ be a subspace of $L$. Suppose $f_0$ is a
linear functional on $L_0$ satisfying the condition

$$f_0(x) \le p(x) \tag{4}$$

on $L_0$. Then $f_0$ can be extended to a linear functional on $L$ satisfying (4)
on the whole space $L$. More exactly, there is a linear functional $f$ defined
on $L$ and equal to $f_0$ at every point of $L_0$, such that $f(x) \le p(x)$ on $L$.

Proof. Suppose $L_0 \ne L$, since otherwise the theorem is trivial. We
begin by showing that $f_0$ can be extended onto a larger space $L^{(1)}$ without
violating the condition (4). Let $z$ be any element of $L - L_0$, and let $L^{(1)}$
be the subspace generated by $L_0$ and the element $z$, i.e., the set of all linear
combinations

$$x + tz \qquad (x \in L_0).$$

If $f$ is to be an extension of $f_0$ onto $L^{(1)}$, we must have

$$f(x + tz) = f_0(x) + t f(z)$$

or

$$f(x + tz) = f_0(x) + tc \tag{5}$$

after setting $f(z) = c$. We now choose $c$ such that the “majorization”
condition $f(x + tz) \le p(x + tz)$ is satisfied, i.e., such that

$$f_0(x) + tc \le p(x + tz).$$

We can write this condition as

$$f_0\!\left(\frac{x}{t}\right) + c \le p\!\left(\frac{x}{t} + z\right)$$

or

$$c \le p\!\left(\frac{x}{t} + z\right) - f_0\!\left(\frac{x}{t}\right) \tag{6}$$

if $t > 0$, and as

$$f_0\!\left(\frac{x}{t}\right) + c \ge -p\!\left(-\frac{x}{t} - z\right)$$

or

$$c \ge -p\!\left(-\frac{x}{t} - z\right) - f_0\!\left(\frac{x}{t}\right) \tag{7}$$

if $t < 0$. Hence we want to show that there is always a value of $c$ satisfying
(6) and (7). Let $y'$ and $y''$ be arbitrary elements of $L_0$. Then it follows
from the inequality

$$f_0(y'') - f_0(y') \le p(y'' - y') = p((y'' + z) - (y' + z)) \le p(y'' + z) + p(-y' - z)$$

that

$$-f_0(y'') + p(y'' + z) \ge -f_0(y') - p(-y' - z). \tag{8}$$

Let

$$c' = \sup_{y'} \left[ -f_0(y') - p(-y' - z) \right],$$

$$c'' = \inf_{y''} \left[ -f_0(y'') + p(y'' + z) \right].$$

Then

$$c'' \ge c',$$

by (8) and the fact that $y'$ and $y''$ are arbitrary. Hence, choosing $c$ such
that

$$c'' \ge c \ge c',$$

we find that the functional $f$ defined on $L^{(1)}$ by the formula (5) satisfies the
condition $f(x) \le p(x)$. Thus we have succeeded in showing that if $f_0$ is
defined on a subspace $L_0 \subset L$ and satisfies (4) on $L_0$, then $f_0$ can be
extended onto a larger subspace $L^{(1)}$ with the condition (4) being preserved.

To complete the proof, suppose first that $L$ is generated by a countable
set of elements $x_1, x_2, \ldots, x_n, \ldots$ in $L$. Then we construct a functional
on $L$ by induction, i.e., by constructing a sequence of subspaces

$$L^{(1)} = \{L_0, x_1\}, \quad L^{(2)} = \{L^{(1)}, x_2\}, \quad \ldots,$$

each contained in the next. Here $\{L^{(k)}, x_{k+1}\}$ denotes the minimal linear
subspace of $L$ containing $L^{(k)}$ and $x_{k+1}$. This process extends the
functional onto the whole space $L$, since every element $x \in L$ belongs to
some subspace $L^{(k)}$.

More generally, i.e., in the case where there is no countable set
generating $L$, the theorem is proved by applying Zorn's lemma (see
p. 28). The set $\mathcal{F}$ of all possible extensions of the functional $f_0$ satisfying
the majorization condition (4) is partially ordered, and each linearly
ordered subset $\mathcal{F}_0 \subset \mathcal{F}$ has an upper bound. This upper bound is the
functional which is defined on the union of the domains of all functionals
$f \in \mathcal{F}_0$, and coincides with every such functional $f$ on the domain of $f$.
Hence, by Zorn's lemma, $\mathcal{F}$ has a maximal element $f$. Clearly $f$ must be
the desired functional extending $f_0$ onto $L$ and satisfying the condition
$f(x) \le p(x)$, since otherwise we could extend $f$ in turn, by the method
described above, from the proper subspace on which it is defined onto a
larger subspace, thereby contradicting the maximality of $f$. ∎
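
The one-step extension in the first part of the proof can be followed on a small example. The space, subspace and functionals below are assumptions made here for illustration; they are not taken from the text.

```latex
% Take L = R^2, p(x) = |x_1| + |x_2|, L_0 = \{(s, 0)\}, f_0(s, 0) = s and z = (0, 1).
% Condition (4) holds on L_0 because s \le |s|. For y' = y'' = (s, 0):
\begin{align*}
c'  &= \sup_{s} \bigl[ -s - p(-s, -1) \bigr] = \sup_{s} \bigl[ -s - |s| - 1 \bigr] = -1
      && (\text{attained for } s \le 0), \\
c'' &= \inf_{s} \bigl[ -s + p(s, 1) \bigr] = \inf_{s} \bigl[ -s + |s| + 1 \bigr] = 1
      && (\text{attained for } s \ge 0).
\end{align*}
% Hence any c with -1 \le c \le 1 is admissible, and the extension is
\[
f(x) = x_1 + c\,x_2, \qquad
f(x) \le |x_1| + |c|\,|x_2| \le |x_1| + |x_2| = p(x).
\]
```

In this example every $c$ in the interval $[-1, 1]$ works, so the extension guaranteed by the theorem need not be unique.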


Next we turn to the case of complex linear spaces: 


DEFINITION 3′. A functional $p$ defined on a complex linear space $L$ is
said to be convex if

1) $p(x) \ge 0$ for all $x \in L$ (nonnegativity);

2) $p(\alpha x) = |\alpha|\, p(x)$ for all $x \in L$ and all complex $\alpha$;

3) $p(x + y) \le p(x) + p(y)$ for all $x, y \in L$.


The corresponding complex version of the Hahn-Banach theorem is
given by

THEOREM 5′. Let $p$ be a finite convex functional defined on a complex
linear space $L$, and let $L_0$ be a subspace of $L$. Suppose $f_0$ is a linear
functional on $L_0$ satisfying the condition

$$|f_0(x)| \le p(x) \tag{4$'$}$$

on $L_0$. Then $f_0$ can be extended to a linear functional on $L$ satisfying (4′)
on the whole space $L$.


Proof. Let $L_R$ and $L_{0R}$ denote the spaces $L$ and $L_0$, regarded as real
linear spaces. Clearly p is a finite convex functional on $L_R$, while

$$f_{0R}(x) = \operatorname{Re} f_0(x)$$

is a real linear functional on $L_{0R}$ satisfying the condition

$$|f_{0R}(x)| \le p(x)$$

and hence (a fortiori) the condition

$$f_{0R}(x) \le p(x).$$

By Theorem 5, there exists a real linear functional $f_R$ defined on all of $L_R$,
satisfying the conditions

$$f_R(x) \le p(x) \ \text{ if } x \in L_R\ (= L), \qquad f_R(x) = f_{0R}(x) \ \text{ if } x \in L_{0R}\ (= L_0).$$

Clearly

$$-f_R(x) = f_R(-x) \le p(-x) = p(x),$$

and hence

$$|f_R(x)| \le p(x) \ \text{ if } x \in L_R\ (= L). \tag{9}$$

We now define the functional

$$f(x) = f_R(x) - i f_R(ix)$$

on L, using the fact that L is a complex linear space in which multiplication
by complex numbers is defined. It is easily verified that f is a complex
linear functional on L such that

$$f(x) = f_0(x) \ \text{ if } x \in L_0, \qquad \operatorname{Re} f(x) = f_R(x) \ \text{ if } x \in L.$$

Finally, to show that $|f(x)| \le p(x)$ for all $x \in L$, suppose to the contrary
that $|f(x_0)| > p(x_0)$ for some $x_0 \in L$. Writing $f(x_0) = \rho e^{i\varphi}$ where $\rho > 0$,
we set $y_0 = e^{-i\varphi} x_0$. Then

$$f_R(y_0) = \operatorname{Re} f(y_0) = \operatorname{Re}\,[e^{-i\varphi} f(x_0)] = \rho > p(x_0) = p(y_0),$$

which contradicts (9). ∎
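The formula $f(x) = f_R(x) - i f_R(ix)$ used in the proof of Theorem 5’ can be checked numerically. The following is only a sketch: the space $C^2$, the coefficient tuple `C` and the function names are assumptions made for illustration and do not come from the text.

```python
# Hypothetical example: a complex linear functional on C^2, f(x) = c1*x1 + c2*x2.
C = (2 - 1j, 0.5 + 3j)  # assumed coefficients, chosen only for illustration


def f(x):
    """Complex linear functional f(x) = sum of c_k * x_k."""
    return sum(c * xk for c, xk in zip(C, x))


def f_real(x):
    """Real part f_R(x) = Re f(x), a real linear functional."""
    return f(x).real


def rebuild(x):
    """Recover f from its real part via f(x) = f_R(x) - i f_R(ix)."""
    ix = tuple(1j * xk for xk in x)
    return f_real(x) - 1j * f_real(ix)


x = (1 + 2j, -3 + 0.5j)
assert abs(rebuild(x) - f(x)) < 1e-12
```

The check works because $f(ix) = i f(x)$, so that $\operatorname{Re} f(ix) = -\operatorname{Im} f(x)$.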


14.5. Separation of convex sets in a linear space. Given a real linear space 
L, let M and N be two subsets of L. Then a linear functional f defined on
L is said to separate M and N if there exists a number C such that

$$f(x) \ge C \ \text{ if } x \in M, \qquad f(x) \le C \ \text{ if } x \in N.$$

It follows at once from this definition that 


1) A linear functional f separates two sets M and N if and only if it
separates $M - N = \{z : z = x - y,\ x \in M,\ y \in N\}$ and $\{0\}$, i.e., the set
consisting of all differences $x - y$ where $x \in M$, $y \in N$ and the set
whose only element is 0 (note that the minus sign in $M - N$ does not
have the usual meaning of a set difference);

2) A linear functional f separates two sets M and N if and only if it
separates the sets $M - x_0 = \{z : z = x - x_0,\ x \in M\}$ and
$N - x_0 = \{z : z = y - x_0,\ y \in N\}$ for every $x_0 \in L$.


The following theorem on the separation of convex sets in a linear space 
has numerous applications and is an easy consequence of the Hahn-Banach 
theorem: 


THEOREM 6. Let M and N be two disjoint convex sets in a real linear
space L, where at least one of the sets, say M, has a nonempty interior
(i.e., is a convex body). Then there exists a nontrivial linear functional f on
L separating M and N.


Proof. There is no loss of generality in assuming that the point 0
belongs to the interior of M, since otherwise we need only consider the
sets $M - x_0 = \{z : z = x - x_0,\ x \in M\}$ and $N - x_0 = \{z : z = y - x_0,\ y \in N\}$,
where $x_0$ is some point of the interior of M. Let $y_0$ be a point of
N. Then the point $-y_0$ belongs to the interior of the set
$M - N = \{z : z = x - y,\ x \in M,\ y \in N\}$, and 0 belongs to the interior of the set
$M - N + y_0 = \{z : z = x - y + y_0,\ x \in M,\ y \in N\}$. Since M and N are
disjoint, we have $0 \notin M - N$, $y_0 \notin M - N + y_0$. Let p be the Minkowski
functional for the set $M - N + y_0$. Then $p(y_0) \ge 1$ since $y_0 \notin M - N + y_0$.
Consider the linear functional

$$f_0(\alpha y_0) = \alpha\, p(y_0)$$

defined on the one-dimensional subspace of L consisting of all elements
of the form $\alpha y_0$. Clearly $f_0$ satisfies the condition

$$f_0(\alpha y_0) \le p(\alpha y_0),$$

since

$$p(\alpha y_0) = \alpha\, p(y_0) \quad \text{if } \alpha \ge 0,$$

while

$$f_0(\alpha y_0) = \alpha f_0(y_0) < 0 \le p(\alpha y_0) \quad \text{if } \alpha < 0.$$

Hence, by the Hahn-Banach theorem, the functional $f_0$ can be extended
to a linear functional f defined on the whole space L and satisfying the
condition $f(y) \le p(y)$ on L. It follows that $f(y) \le 1$ if $y \in M - N + y_0$,
while at the same time $f(y_0) \ge 1$, i.e., f separates the sets $M - N + y_0$
and $\{y_0\}$. Therefore f separates the sets $M - N$ and $\{0\}$. But then f
separates the sets M and N. ∎
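The definition of separation, together with property 1) above, can be illustrated in the plane. This is a minimal sketch under stated assumptions: M is taken to be the closed unit disk, N the half-plane $x_1 \ge 2$, and the functional `f`, the constant `C` and the sample grids are all hypothetical choices made for the example.

```python
import math


def f(x):
    """Linear functional f(x) = -x_1 on the plane (a hypothetical choice)."""
    return -x[0]


C = -1.5  # separating constant: f >= C on M, f <= C on N

# Sample points of M (closed unit disk) on a polar grid.
M = [(r * math.cos(t), r * math.sin(t))
     for r in (0.0, 0.5, 1.0)
     for t in (k * math.pi / 6 for k in range(12))]
# Sample points of N (half-plane x_1 >= 2).
N = [(2.0 + a, b) for a in (0.0, 1.0, 5.0) for b in (-3.0, 0.0, 3.0)]

assert all(f(x) >= C for x in M)
assert all(f(y) <= C for y in N)

# Property 1): f separates M - N and {0}, here with the constant 0.
differences = [(x[0] - y[0], x[1] - y[1]) for x in M for y in N]
assert all(f(z) >= 0.0 for z in differences)
```

A finite sample cannot prove separation, of course; it only shows the inequalities of the definition at work on concrete points.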


Problem 1. Let M be the set of all points $x = (x_1, x_2, \ldots, x_n, \ldots)$ in $l_2$
satisfying the condition

$$\sum_{n=1}^{\infty} n^2 x_n^2 \le 1.$$

Prove that M is a convex set, but not a convex body.


Problem 2. Give an example of two convex bodies whose intersection is 
not a convex body. 


Problem 3. We say that n + 1 points $x_1, x_2, \ldots, x_{n+1}$ in a linear space L
are “in general position” if they do not belong to any $(n - 1)$-dimensional
subspace of L. The convex hull of a set of n + 1 points $x_1, x_2, \ldots, x_{n+1}$ in
general position is called an n-dimensional simplex, and the points $x_1, x_2, \ldots,
x_{n+1}$ themselves are called the vertices of the simplex. Describe the
zero-dimensional, one-dimensional, two-dimensional and three-dimensional
simplexes in Euclidean three-space $R^3$. Prove that the simplex with vertices
$x_1, x_2, \ldots, x_{n+1}$ is the set of all points in L which can be represented in the
form

$$x = \sum_{k=1}^{n+1} \alpha_k x_k,$$

where

$$\alpha_k \ge 0, \qquad \sum_{k=1}^{n+1} \alpha_k = 1.$$


Problem 4. Show that if the points $x_1, x_2, \ldots, x_{n+1}$ are in general position,
then so are any k + 1 (k < n) of them.

Comment. Hence the k + 1 points generate a k-dimensional simplex,
called a k-dimensional face of the n-dimensional simplex with vertices
$x_1, x_2, \ldots, x_{n+1}$.

Problem 5. Describe all zero-dimensional, one-dimensional and
two-dimensional faces of the tetrahedron in $R^3$ with vertices $e_1, e_2, e_3, e_4$.


Problem 6. Show that in the Hahn-Banach theorem we can drop the 
condition that the functional p be finite. 


15. Normed Linear Spaces


15.1. Definitions and examples. Chapters 2 and 3 deal with topological 
(in particular, metric) spaces, i.e., spaces equipped with the notion of
closeness of elements, while Secs. 13 and 14 deal with linear spaces, i.e.,
spaces equipped with the operations of addition of elements and multiplication
of elements by numbers. We now combine these two ideas, arriving at
the notion of a topological linear space, equipped with a topology as well 
as with the algebraic operations characterizing a linear space. In this section 
and the next, we will study topological linear spaces of a particularly 
important type, namely normed linear spaces and Euclidean spaces. Topo- 
logical linear spaces in general will be considered in Sec. 17. 


DEFINITION 1. A functional p defined on a linear space L is said to be
a norm (in L) if it has the following properties:

a) p is finite and convex;

b) $p(x) = 0$ only if $x = 0$;

c) $p(\alpha x) = |\alpha|\, p(x)$ for all $x \in L$ and all $\alpha$.


Recalling the definition of a convex functional, we see that a norm in
L is a finite functional on L such that

1) $p(x) \ge 0$ for all $x \in L$, where $p(x) = 0$ if and only if $x = 0$;
2) $p(\alpha x) = |\alpha|\, p(x)$ for all $x \in L$ and all $\alpha$;
3) $p(x + y) \le p(x) + p(y)$ for all $x, y \in L$.


DEFINITION 2. A linear space L, equipped with a norm p(x) = ||x||, is 
called a normed linear space. 


The notation $\|x\|$ will henceforth be preferred for the norm of the element
$x \in L$. In terms of this notation, properties 1)-3) take the form:

1’) $\|x\| \ge 0$ for all $x \in L$, where $\|x\| = 0$ if and only if $x = 0$;
2’) $\|\alpha x\| = |\alpha|\, \|x\|$ for all $x \in L$ and all $\alpha$;
3’) Triangle inequality: $\|x + y\| \le \|x\| + \|y\|$ for all $x, y \in L$.


Every normed linear space L becomes a metric space if we set 


$$\rho(x, y) = \|x - y\| \tag{1}$$

for arbitrary $x, y \in L$. The fact that (1) is a metric follows at once from
properties 1’)-3’). Thus everything said about metric spaces in Chap. 2 
carries over to the case of normed linear spaces. 

Many of the spaces considered in Chap. 2 as examples of metric spaces 
(or in Sec. 13 as examples of linear spaces) can be made into normed linear 
spaces in a natural way, as shown by the following examples (in each case, 
verify that the norm has all the required properties): 


8 One of the pioneer workers in this field was Stefan Banach (1892-1945), author of
the classic Théorie des Opérations Linéaires, reprinted by Chelsea Publishing Co., New
York (1955).

Example 1. The real line $R^1$ becomes a normed linear space if we set
$\|x\| = |x|$ for every number $x \in R^1$.


Example 2. To make real n-space $R^n$ into a normed linear space, we set

$$\|x\| = \sqrt{\sum_{k=1}^{n} x_k^2}$$

for every element $x = (x_1, x_2, \ldots, x_n)$ in $R^n$. The formula

$$\rho(x, y) = \|x - y\| = \sqrt{\sum_{k=1}^{n} (x_k - y_k)^2}$$

then defines the same metric in $R^n$ as already considered in Example 3, p. 38.


Example 3. We can also equip real n-space with the norm

$$\|x\|_1 = \sum_{k=1}^{n} |x_k| \tag{2}$$

or the norm

$$\|x\|_0 = \max_{1 \le k \le n} |x_k|. \tag{3}$$

The corresponding metrics lead to the spaces $R_1^n$ and $R_0^n$ considered in
Examples 4 and 5, p. 39.
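The three norms on real n-space from Examples 2 and 3 can be compared on concrete vectors. The sketch below spot-checks homogeneity and the triangle inequality; the function names and the sample vectors are assumptions made for illustration only.

```python
import math


def norm_2(x):
    """Euclidean norm of Example 2."""
    return math.sqrt(sum(xk * xk for xk in x))


def norm_1(x):
    """Norm (2): sum of absolute values."""
    return sum(abs(xk) for xk in x)


def norm_max(x):
    """Norm (3): largest absolute value of a coordinate."""
    return max(abs(xk) for xk in x)


x = (3.0, -4.0, 0.0)
y = (1.0, 2.0, -2.0)
s = tuple(a + b for a, b in zip(x, y))

for norm in (norm_2, norm_1, norm_max):
    # Triangle inequality and homogeneity, checked on this one pair of vectors.
    assert norm(s) <= norm(x) + norm(y) + 1e-12
    assert abs(norm(tuple(-2 * a for a in x)) - 2 * norm(x)) < 1e-12

print(norm_2(x), norm_1(x), norm_max(x))  # 5.0 7.0 4.0
```

Such a check verifies the norm axioms only for the chosen vectors; the general proofs are the exercise requested in the text.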


Example 4. The formula

$$\|x\| = \sqrt{\sum_{k=1}^{n} |x_k|^2}$$

introduces a norm in complex n-space $C^n$. Other possible norms in $C^n$ are
given by (2) and (3).


Example 5. The space $C_{[a,b]}$ of all functions continuous on the interval
[a, b] can be equipped with the norm

$$\|f\| = \max_{a \le t \le b} |f(t)|.$$

The metric space corresponding to this norm has already been considered in
Example 6, p. 39.
Example 6. Let m be the space of all bounded numerical sequences

$$x = (x_1, x_2, \ldots, x_k, \ldots),$$

and let

$$\|x\| = \sup_k |x_k|. \tag{4}$$

Then (4) obviously has all the properties of a norm. The metric “induced”
by this norm is the same as that considered in Example 9, p. 41.

Example 7. A complete normed linear space, relative to the metric (1), is
called a Banach space. It is easy to see that the spaces in Examples 1-6 are
all Banach spaces (the details are left as an exercise).


15.2. Subspaces of a normed linear space. In Sec. 13.3 we defined a
subspace of a linear space L (unequipped with any topology) as a nonempty
set $L_0$ with the property that if $x, y \in L_0$, then $\alpha x + \beta y \in L_0$ for arbitrary $\alpha$
and $\beta$. The subspaces of greatest interest in a normed linear space are the
closed subspaces, i.e., those containing all their limit points. In the case of an
infinite-dimensional space, it is easy to give examples of subspaces that are
not closed:⁹


Example 1. In the space of all bounded sequences, the sequences with
only finitely many nonzero terms form a subspace, but not a closed subspace,
since, for example, the closure of the subspace contains the sequence

$$\left(1, \tfrac{1}{2}, \ldots, \tfrac{1}{n}, \ldots\right).$$


Example 2. The set $P_{[a,b]}$ of all polynomials defined on the interval [a, b]
is a subspace of $C_{[a,b]}$, but obviously not a closed subspace. On the other
hand, the closure of $P_{[a,b]}$ coincides with $C_{[a,b]}$, since every function
continuous on [a, b] is the limit of a uniformly convergent sequence of
polynomials, by Weierstrass’ approximation theorem.¹⁰


In what follows, we will be concerned as a rule with closed subspaces.
Hence it is natural to modify somewhat the terminology adopted in Sec. 13.3,
i.e., by a subspace of a normed linear space we will always mean a
closed subspace. In particular, by the subspace generated by a set of elements
$\{x_\alpha\}$ we will always mean the smallest closed subspace containing $\{x_\alpha\}$. This
subspace will also be called the linear closure of $\{x_\alpha\}$. The term linear manifold
will be reserved for a set of elements $L_0$ (not necessarily closed) such that
$x, y \in L_0$ implies $\alpha x + \beta y \in L_0$ for arbitrary numbers $\alpha$ and $\beta$. A set of
elements $\{x_\alpha\}$ in a normed linear space L is said to be complete (in L) if the
linear closure of $\{x_\alpha\}$ coincides with L.




Example 3. By Weierstrass’ approximation theorem, the set of functions
$1, t, t^2, \ldots, t^n, \ldots$ is complete in $C_{[a,b]}$.


9 This contingency cannot arise in a finite-dimensional subspace (see Problem 5a).
10 See e.g., G. P. Tolstov, Fourier Series (translated by R. A. Silverman), Prentice-Hall,
Inc., Englewood Cliffs, N.J. (1962), p. 120.

Problem 1. A subset M of a normed linear space R is said to be bounded
if there is a constant C such that $\|x\| \le C$ for all $x \in M$. Reconcile this with
Problem 5, p. 65.


Problem 2. Given a Banach space R, let $\{B_n\}$ be a nested sequence of
closed spheres in R (so that $B_1 \supset B_2 \supset \cdots \supset B_n \supset \cdots$). Prove that $\bigcap_n B_n$
is nonempty (it is not assumed that the radius of $B_n$ approaches 0 as $n \to \infty$).
Give an example of a nested sequence $\{E_n\}$ of nonempty closed bounded
convex sets in a Banach space R such that $\bigcap_n E_n$ is empty (cf. Problem 6,
p. 66).


Problem 3. Prove that the algebraic dimension (defined in Problem 4c, 
p. 128) of an infinite-dimensional Banach space is uncountable. 


Problem 4. Let R be a Banach space, and let M be a closed subspace of R.
Define a norm in the factor space P = R/M by setting

$$\|\xi\| = \inf_{x \in \xi} \|x\|$$

for every element (residue class) $\xi \in P$. Prove that
a) $\|\xi\|$ is actually a norm in P;
b) The space P, equipped with this norm, is a Banach space.


Problem 5. Let R be a normed linear space. Prove that 

a) Every finite-dimensional linear subspace of R is closed; 

b) If M is a closed subspace of R and N a finite-dimensional subspace 
of R, then the set 


$$M + N = \{z : z = x + y,\ x \in M,\ y \in N\} \tag{5}$$

is a closed subspace of R;
c) If Q is an open convex set in R and $x_0 \notin Q$, then there exists a closed
hyperplane which passes through the point $x_0$ and does not intersect Q.


Problem 6. Let $x = (x_1, x_2, \ldots, x_n, \ldots)$ be an arbitrary element of $l_2$.
Prove that $l_2$ is a normed linear space when equipped with the norm

$$\|x\| = \sqrt{\sum_{k=1}^{\infty} x_k^2}.$$

Give an example of two closed linear subspaces M and N of $l_2$ whose “linear
sum” M + N is not closed.


Problem 7. Two norms $\|\cdot\|_1$, $\|\cdot\|_2$ in a linear space R are said to be
equivalent if there exist constants a, b > 0 such that

$$a\,\|x\|_1 \le \|x\|_2 \le b\,\|x\|_1$$

for all $x \in R$. Prove that if R is finite-dimensional, then any two norms in
R are equivalent.


16. Euclidean Spaces


16.1. Scalar products. Orthogonality and bases. We begin with two key 
definitions: 


DEFINITION 1. By a scalar product in a real linear space R is meant a
real function defined for every pair of elements $x, y \in R$ and denoted by
(x, y), with the following properties:

1) $(x, x) \ge 0$ where $(x, x) = 0$ if and only if $x = 0$;

2) $(x, y) = (y, x)$;

3) $(\lambda x, y) = \lambda (x, y)$;

4) $(x, y + z) = (x, y) + (x, z)$

(valid for all $x, y, z \in R$ and all real $\lambda$).

DEFINITION 2. A linear space R equipped with a scalar product is called
a Euclidean space.


LEMMA. Any two elements x, y of a Euclidean space R satisfy the
Schwarz inequality

$$|(x, y)| \le \|x\|\,\|y\|, \tag{1}$$

where

$$\|x\| = \sqrt{(x, x)}, \qquad \|y\| = \sqrt{(y, y)}.$$

Proof. The quadratic polynomial

$$\varphi(\lambda) = (\lambda x + y, \lambda x + y) = \lambda^2 (x, x) + 2\lambda (x, y) + (y, y) = \|x\|^2 \lambda^2 + 2 (x, y) \lambda + \|y\|^2$$

is obviously nonnegative. Therefore

$$(x, y)^2 - \|x\|^2 \|y\|^2 \le 0, \tag{2}$$

since otherwise $\varphi(\lambda)$ would become negative for some $\lambda$ (why?). But (2)
is equivalent to (1). ∎
We now use the scalar product in a Euclidean space R to introduce a
norm in R:

THEOREM 1. A Euclidean space R becomes a normed linear space when
equipped with the norm

$$\|x\| = \sqrt{(x, x)} \qquad (x \in R).$$

Proof. Properties 1’) and 2’) on p. 138 are immediate consequences
of the definition of a scalar product. To prove property 3’), i.e., the
triangle inequality, we note that

$$\|x + y\|^2 = (x + y, x + y) = (x, x) + 2 (x, y) + (y, y) \le (x, x) + 2\,|(x, y)| + (y, y) \le \|x\|^2 + 2\,\|x\|\,\|y\| + \|y\|^2 = (\|x\| + \|y\|)^2,$$

because of the Schwarz inequality (1), and hence

$$\|x + y\| \le \|x\| + \|y\|.$$

∎


The scalar product in R can be used to define the angle between two
vectors as well as the length (i.e., norm) of a vector:

DEFINITION 3. Given any two vectors x and y in a Euclidean space R,
the quantity $\theta$ defined by the formula

$$\cos \theta = \frac{(x, y)}{\|x\|\,\|y\|} \qquad (0 \le \theta \le \pi) \tag{3}$$

is called the angle between x and y.

Remark. It follows from Schwarz’s inequality (1) that the right-hand
side of (3) cannot exceed 1 in absolute value. Therefore, given any x and y,
(3) actually determines a unique angle in the interval $[0, \pi]$.
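The Schwarz inequality (1) and the angle formula (3) are easy to try out in the plane with the usual scalar product. The following is a sketch only; the helper names `dot`, `norm` and `angle` are hypothetical, and the clamping step is an assumption added to guard against floating-point rounding.

```python
import math


def dot(x, y):
    """Scalar product of two real vectors of equal length."""
    return sum(a * b for a, b in zip(x, y))


def norm(x):
    """Norm induced by the scalar product."""
    return math.sqrt(dot(x, x))


def angle(x, y):
    """Angle between nonzero vectors x and y, following formula (3)."""
    c = dot(x, y) / (norm(x) * norm(y))
    c = max(-1.0, min(1.0, c))  # rounding may push c slightly outside [-1, 1]
    return math.acos(c)


x = (1.0, 0.0)
y = (1.0, 1.0)

# Schwarz inequality (1) for this pair of vectors.
assert abs(dot(x, y)) <= norm(x) * norm(y)
# The angle between (1, 0) and (1, 1) is pi/4, up to rounding.
assert abs(angle(x, y) - math.pi / 4) < 1e-12
```

Note that the formula requires both vectors to be nonzero, since otherwise the denominator vanishes.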


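Suppose (x, y) = 0, so that (3) implies $\theta = \pi/2$. Then the vectors x and y
are said to be orthogonal. A set of nonzero vectors $\{x_\alpha\}$ in R is said to be
an orthogonal system if

$$(x_\alpha, x_\beta) = 0 \quad \text{for } \alpha \ne \beta,$$

and an orthonormal system if

$$(x_\alpha, x_\beta) = \begin{cases} 0 & \text{for } \alpha \ne \beta, \\ 1 & \text{for } \alpha = \beta. \end{cases}$$

If $\{x_\alpha\}$ is an orthogonal system, then clearly

$$\left\{ \frac{x_\alpha}{\|x_\alpha\|} \right\}$$

is an orthonormal system.

THEOREM 2. The vectors in an orthogonal system $\{x_\alpha\}$ are linearly
independent.

Proof. Suppose

$$c_1 x_{\alpha_1} + c_2 x_{\alpha_2} + \cdots + c_n x_{\alpha_n} = 0.$$

Then, taking the scalar product with $x_{\alpha_i}$, we get

$$(x_{\alpha_i}, c_1 x_{\alpha_1} + c_2 x_{\alpha_2} + \cdots + c_n x_{\alpha_n}) = c_i (x_{\alpha_i}, x_{\alpha_i}) = 0,$$

by the orthogonality of $\{x_\alpha\}$. But $(x_{\alpha_i}, x_{\alpha_i}) \ne 0$, and hence

$$c_i = 0 \qquad (i = 1, 2, \ldots, n).$$

∎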


An orthogonal system $\{x_\alpha\}$ is called an orthogonal basis if it is complete,
i.e., if the smallest closed subspace containing $\{x_\alpha\}$ is the whole space R.
Similarly, a complete orthonormal system is called an orthonormal basis.

16.2. Examples. We now give some examples of Euclidean spaces and
orthogonal bases in them:


Example 1. Let $R^n$ be real n-space, i.e., the set of all ordered n-tuples
$x = (x_1, x_2, \ldots, x_n)$, $y = (y_1, y_2, \ldots, y_n)$, ..., equipped with the same
algebraic operations as in Example 2, p. 119. Using the formula

$$(x, y) = \sum_{k=1}^{n} x_k y_k \tag{4}$$

to define a scalar product in $R^n$, we get Euclidean n-space.¹¹ The corresponding
norm and distance in $R^n$ are

$$\|x\| = \sqrt{\sum_{k=1}^{n} x_k^2}$$

and

$$\rho(x, y) = \|x - y\| = \sqrt{\sum_{k=1}^{n} (x_k - y_k)^2}. \tag{5}$$

The vectors

$$e_1 = (1, 0, 0, \ldots, 0), \quad e_2 = (0, 1, 0, \ldots, 0), \quad \ldots, \quad e_n = (0, 0, 0, \ldots, 1)$$

form an orthonormal basis in $R^n$, one of infinitely many such bases.


11 The term “Euclidean n-space” has already been used in Example 3, p. 38 to describe
the metric space with distance (5). In so doing, we anticipated the eventual introduction of
the scalar product (4).

Example 2. The space $l_2$ with elements $x = (x_1, x_2, \ldots, x_n, \ldots)$,
$y = (y_1, y_2, \ldots, y_n, \ldots)$, ..., where

$$\sum_{k=1}^{\infty} x_k^2 < \infty, \quad \sum_{k=1}^{\infty} y_k^2 < \infty, \quad \ldots,$$

becomes an infinite-dimensional Euclidean space when equipped with the
scalar product

$$(x, y) = \sum_{k=1}^{\infty} x_k y_k. \tag{6}$$

The convergence of the right-hand side of (6) follows from the elementary
inequality

$$|x_k y_k| \le (|x_k| + |y_k|)^2 \le 2 (x_k^2 + y_k^2),$$

and it is an easy matter to verify that (6) has all the properties of a scalar
product. The simplest orthonormal basis in $l_2$ consists of the vectors

$$e_1 = (1, 0, 0, \ldots), \quad e_2 = (0, 1, 0, \ldots), \quad e_3 = (0, 0, 1, \ldots), \quad \ldots \tag{7}$$

The orthonormality of the system (7) is obvious. As for the completeness
of the system, given any vector $x = (x_1, x_2, \ldots, x_n, \ldots)$ in $l_2$, let

$$x^{(k)} = (x_1, x_2, \ldots, x_k, 0, 0, \ldots).$$

Then $x^{(k)}$ is a linear combination of the vectors $e_1, e_2, \ldots, e_k$ and
$\|x^{(k)} - x\| \to 0$ as $k \to \infty$.
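The convergence of the truncated vectors to x in the norm of the space can be observed numerically for a particular element. The sketch below assumes the sample element with coordinates $x_j = 1/j$ and cuts the infinite tail off at a finite number of terms; both choices, and the name `tail_norm`, are assumptions of the example.

```python
import math


def tail_norm(k, n_terms=10_000):
    """Norm of x - x^(k) for x_j = 1/j, with the tail cut off at n_terms.

    The cutoff n_terms is an assumption of this sketch; the true tail is
    slightly larger than the value returned here.
    """
    return math.sqrt(sum(1.0 / (j * j) for j in range(k + 1, n_terms + 1)))


ks = (1, 10, 100, 1000)
tails = [tail_norm(k) for k in ks]

# The distances decrease as k grows ...
assert all(a > b for a, b in zip(tails, tails[1:]))
# ... and each squared distance stays below 1/k.
assert all(t * t < 1.0 / k for t, k in zip(tails, ks))
```

The bound $1/k$ comes from comparing $1/j^2$ with $1/(j-1) - 1/j$ and summing the telescoping series.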


Example 3. The space $C^2_{[a,b]}$ consisting of all continuous functions on
[a, b] equipped with the scalar product

$$(f, g) = \int_a^b f(t)\, g(t)\, dt$$

is another example of a Euclidean space. Among the various orthogonal
bases in $C^2_{[a,b]}$, one of the most important is the system of trigonometric
functions

$$1, \quad \cos \frac{2\pi n t}{b - a}, \quad \sin \frac{2\pi n t}{b - a} \qquad (n = 1, 2, \ldots). \tag{8}$$

The orthogonality of this system can be verified by a simple calculation.
Making the choice $a = -\pi$, $b = \pi$, we simplify (8) to

$$1, \quad \cos nt, \quad \sin nt \qquad (n = 1, 2, \ldots). \tag{8'}$$

Thus (8’) is an orthogonal basis in the space $C^2_{[-\pi,\pi]}$. As for the completeness,
we have


THEOREM 3. The system (8) is complete in $C^2_{[a,b]}$.

Proof. By another version of Weierstrass’ approximation theorem,¹²
every function $\varphi$ continuous on the interval [a, b] and such that
$\varphi(a) = \varphi(b)$ is the limit of a uniformly convergent sequence of trigonometric
polynomials, i.e., linear combinations of elements of the system (8).
This sequence converges (a fortiori) to $\varphi$ in the norm of the space $C^2_{[a,b]}$.
But an arbitrary function $f \in C^2_{[a,b]}$ can be represented as the limit in the
$C^2_{[a,b]}$ norm of a sequence of functions $\{\varphi_n\}$, where

$$\varphi_n(x) = \begin{cases} f(x) & \text{if } a \le x \le b - \frac{1}{n}, \\ n f\!\left(b - \frac{1}{n}\right)(b - x) + n f(a)\left(x - b + \frac{1}{n}\right) & \text{if } b - \frac{1}{n} < x \le b \end{cases}$$

coincides with f in the interval $[a, b - (1/n)]$, is linear on $[b - (1/n), b]$
and takes the same value at the point b as at the point a (see Figure 16).
Hence every element of $C^2_{[a,b]}$ can be approximated arbitrarily closely
(in the $C^2_{[a,b]}$ norm) by a linear combination of elements of the system (8). ∎

FIGURE 16

12 See e.g., G. P. Tolstov, op. cit., Corollary 1, p. 117.


16.3. Existence of an orthogonal basis. Orthogonalization. From now on, 
we will be mainly concerned with the case of separable Euclidean spaces, 
i.e., Euclidean spaces containing a countable everywhere dense subset. For 
example, the spaces $R^n$, $l_2$ and $C^2_{[a,b]}$ are all separable, as shown in Sec. 6.3.
An example of a nonseparable Euclidean space is given in Problem 2. 


THEOREM 4. Every orthogonal system $\{x_\alpha\}$ in a separable Euclidean
space R has no more than countably many elements $x_\alpha$.

Proof. There is no loss of generality in assuming that the system
$\{x_\alpha\}$ is orthonormal as well as orthogonal, since otherwise we need only
replace $\{x_\alpha\}$ by

$$\left\{ \frac{x_\alpha}{\|x_\alpha\|} \right\}.$$

We then have

$$\|x_\alpha - x_\beta\| = \sqrt{2} \quad \text{if } \alpha \ne \beta. \tag{9}$$

Consider the set of open spheres $S(x_\alpha, \tfrac{1}{2})$. These spheres are pairwise
disjoint, because of (9). Moreover, each sphere contains at least one
element from some countable subset $\{y_n\}$ everywhere dense in R. Consequently
there are no more than countably many such spheres, and hence
no more than countably many elements $x_\alpha$. ∎


We have already exhibited an orthogonal basis in each of the spaces $R^n$,
$l_2$ and $C^2_{[a,b]}$. The existence of an orthogonal basis in any separable Euclidean
space is guaranteed by the following theorem and its corollary, analogous
to the theorem on the existence of an orthogonal basis in any finite-dimensional
Euclidean space:¹³


THEOREM 5 (Orthogonalization theorem). Let

$$f_1, f_2, \ldots, f_n, \ldots \tag{10}$$

be any (countable) set of linearly independent elements of a Euclidean
space R. Then R contains a set of elements

$$\varphi_1, \varphi_2, \ldots, \varphi_n, \ldots \tag{11}$$

such that

1) The system (11) is orthonormal;

2) Every element $\varphi_n$ is a linear combination

$$\varphi_n = a_{n1} f_1 + a_{n2} f_2 + \cdots + a_{nn} f_n \qquad (a_{nn} \ne 0)$$

of the elements $f_1, f_2, \ldots, f_n$;

3) Every element $f_n$ is a linear combination

$$f_n = b_{n1} \varphi_1 + b_{n2} \varphi_2 + \cdots + b_{nn} \varphi_n \qquad (b_{nn} \ne 0)$$

of the elements $\varphi_1, \varphi_2, \ldots, \varphi_n$.

Moreover, every element of the system (11) is uniquely determined by these
conditions to within a factor of ±1.


Proof. First we construct $\varphi_1$. Setting

$$\varphi_1 = a_{11} f_1,$$

we determine $a_{11}$ from the condition

$$(\varphi_1, \varphi_1) = a_{11}^2 (f_1, f_1) = 1,$$

which implies

$$a_{11} = \frac{1}{b_{11}} = \frac{\pm 1}{\sqrt{(f_1, f_1)}}.$$

This obviously determines $\varphi_1$ uniquely (except for sign).

Next suppose elements $\varphi_1, \varphi_2, \ldots, \varphi_{n-1}$ satisfying the conditions of
the theorem have already been constructed. Then $f_n$ can be written in the
form

$$f_n = b_{n1} \varphi_1 + \cdots + b_{n,n-1} \varphi_{n-1} + h_n, \tag{12}$$

where

$$(h_n, \varphi_k) = 0 \qquad (k = 1, 2, \ldots, n - 1).$$

In fact, the coefficients $b_{nk}$, and hence the element $h_n$, are uniquely
determined by the conditions

$$(h_n, \varphi_k) = (f_n - b_{n1} \varphi_1 - \cdots - b_{n,n-1} \varphi_{n-1}, \varphi_k) = (f_n, \varphi_k) - b_{nk} (\varphi_k, \varphi_k) = 0,$$

i.e.,

$$b_{nk} = (f_n, \varphi_k) \qquad (k = 1, 2, \ldots, n - 1).$$

Clearly $(h_n, h_n) > 0$, since $(h_n, h_n) = 0$ contradicts the assumed linear
independence of the elements (10). Let

$$\varphi_n = \frac{h_n}{\sqrt{(h_n, h_n)}}. \tag{13}$$

Using (12) and (13), we express $h_n$ and hence $\varphi_n$ in terms of the elements
$f_1, f_2, \ldots, f_n$, i.e.,

$$\varphi_n = a_{n1} f_1 + a_{n2} f_2 + \cdots + a_{nn} f_n,$$

where

$$a_{nn} = \frac{1}{\sqrt{(h_n, h_n)}}.$$

Moreover

$$(\varphi_n, \varphi_k) = 0 \quad (k = 1, 2, \ldots, n - 1), \qquad (\varphi_n, \varphi_n) = 1,$$

and

$$f_n = b_{n1} \varphi_1 + b_{n2} \varphi_2 + \cdots + b_{nn} \varphi_n,$$

where

$$b_{nn} = \sqrt{(h_n, h_n)} > 0.$$

Thus, starting from elements $\varphi_1, \varphi_2, \ldots, \varphi_{n-1}$ satisfying the conditions
of the theorem, we have constructed elements $\varphi_1, \varphi_2, \ldots, \varphi_{n-1}, \varphi_n$
satisfying the same conditions. The proof now follows by mathematical
induction. ∎

13 See e.g., G. E. Shilov, An Introduction to the Theory of Linear Spaces (translated by
R. A. Silverman), Prentice-Hall, Inc., Englewood Cliffs, N.J. (1961), Theorem 28, p. 142.


Remark. The process leading from the linearly independent elements (10) 
to the orthonormal system (11) is called orthogonalization. It is clear that 
the subspace generated by (10) coincides with that generated by (11). 
Hence the set (10) is complete if and only if the set (11) is complete. 
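The orthogonalization process of Theorem 5 can be sketched for vectors in real n-space with the scalar product (4). This is an illustration under stated assumptions, not a definitive implementation: the names `dot` and `orthonormalize`, the tolerance `tol`, and the choice of the positive sign at each step are all hypothetical.

```python
import math


def dot(x, y):
    """Scalar product (4) in real n-space."""
    return sum(a * b for a, b in zip(x, y))


def orthonormalize(fs, tol=1e-12):
    """Return the orthonormal system built from the linearly independent list fs.

    tol is a hypothetical threshold for detecting linear dependence numerically.
    """
    phis = []
    for f in fs:
        # h_n = f_n - sum_k b_nk * phi_k with b_nk = (f_n, phi_k), as in (12).
        h = list(f)
        for phi in phis:
            b = dot(f, phi)
            h = [hk - b * pk for hk, pk in zip(h, phi)]
        hh = dot(h, h)
        if hh <= tol:
            raise ValueError("elements are not linearly independent")
        # phi_n = h_n / sqrt((h_n, h_n)), as in (13), taking the positive sign.
        phis.append([hk / math.sqrt(hh) for hk in h])
    return phis


fs = [(1.0, 1.0, 0.0), (1.0, 0.0, 1.0), (0.0, 1.0, 1.0)]
phis = orthonormalize(fs)
for i, p in enumerate(phis):
    for j, q in enumerate(phis):
        expected = 1.0 if i == j else 0.0
        assert abs(dot(p, q) - expected) < 1e-12
```

The coefficients are computed from the original element $f_n$, exactly as in the proof; for long or nearly dependent lists a numerically more stable variant would subtract the projections from the running remainder instead.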


COROLLARY. Every separable Euclidean space R has a countable 
orthonormal basis. 


Proof. Let $\psi_1, \psi_2, \ldots, \psi_n, \ldots$ be a countable everywhere dense
subset of R. Then a complete set of linearly independent elements
$f_1, f_2, \ldots, f_n, \ldots$ can be selected from $\{\psi_n\}$. In fact, we need only eliminate
from the sequence $\{\psi_n\}$ all elements $\psi_k$ which can be written as linear
combinations of elements $\psi_i$ with smaller indices (i < k). Applying the
orthogonalization process to $f_1, f_2, \ldots, f_n, \ldots$, we get an orthonormal
basis. ∎


16.4. Bessel’s inequality. Closed orthogonal systems. Let $e_1, e_2, \ldots, e_n$
be an orthonormal basis in $R^n$. Then every vector $x \in R^n$ can be written in
the form

$$x = \sum_{k=1}^{n} c_k e_k,$$

where

$$c_k = (x, e_k).$$

We now show how this generalizes to the case of an infinite-dimensional
Euclidean space R. Let $\varphi_1, \varphi_2, \ldots, \varphi_k, \ldots$ be an orthonormal system in
R, and let f be an arbitrary element of R. Suppose that with f we associate

1) The sequence of numbers

$$c_k = (f, \varphi_k) \qquad (k = 1, 2, \ldots), \tag{14}$$

called the components or Fourier coefficients of f with respect to the
system $\{\varphi_k\}$;

2) The series

$$\sum_{k=1}^{\infty} c_k \varphi_k \tag{15}$$

(for the time being, purely formal), called the Fourier series of f with
respect to the system $\{\varphi_k\}$.

Then it is natural to ask whether the series (15) converges,¹⁴ and if so,
whether the sum of the series coincides with the original function f. To
answer these questions, we first prove


THEOREM 6. Given an orthonormal system

$$\varphi_1, \varphi_2, \ldots, \varphi_n, \ldots \tag{16}$$

in a Euclidean space R, let f be an arbitrary element of R. Then the
expression

$$\left\| f - \sum_{k=1}^{n} a_k \varphi_k \right\|^2$$

achieves its minimum for

$$a_k = c_k = (f, \varphi_k) \qquad (k = 1, 2, \ldots, n).$$

This minimum equals

$$\|f\|^2 - \sum_{k=1}^{n} c_k^2.$$

Moreover

$$\sum_{k=1}^{\infty} c_k^2 \le \|f\|^2, \tag{17}$$

a result known as Bessel’s inequality.

14 More exactly, whether the sequence of partial sums corresponding to (15) converges
in the metric of R.


Proof. Let

$$S_n = \sum_{k=1}^{n} a_k \varphi_k. \tag{18}$$

Then, by the orthonormality of (16),

$$\begin{aligned} \|f - S_n\|^2 &= \left( f - \sum_{k=1}^{n} a_k \varphi_k,\ f - \sum_{k=1}^{n} a_k \varphi_k \right) \\ &= (f, f) - 2 \left( f, \sum_{k=1}^{n} a_k \varphi_k \right) + \left( \sum_{k=1}^{n} a_k \varphi_k,\ \sum_{l=1}^{n} a_l \varphi_l \right) \\ &= \|f\|^2 - 2 \sum_{k=1}^{n} a_k c_k + \sum_{k=1}^{n} a_k^2, \end{aligned}$$

or

$$\|f - S_n\|^2 = \|f\|^2 - \sum_{k=1}^{n} c_k^2 + \sum_{k=1}^{n} (a_k - c_k)^2, \tag{19}$$

where

$$c_k = (f, \varphi_k) \qquad (k = 1, 2, \ldots, n).$$

The expression in the right-hand side of (19) obviously achieves its minimum
when its last term vanishes, i.e., when

$$a_k = c_k \qquad (k = 1, 2, \ldots, n),$$

and this minimum is just

$$\|f - S_n\|^2 = \|f\|^2 - \sum_{k=1}^{n} c_k^2. \tag{20}$$

Moreover, since $\|f - S_n\|^2 \ge 0$, it follows from (20) that

$$\sum_{k=1}^{n} c_k^2 \le \|f\|^2 \tag{21}$$

for every n. Hence the series

$$\sum_{k=1}^{\infty} c_k^2$$

is convergent. Taking the limit as $n \to \infty$ in (21), we get (17). ∎

Remark. Geometrically, Bessel’s inequality (17) means that the sum of
the squares of the projections of a vector f onto a set of mutually perpendicular
directions cannot exceed the square of the length of the vector itself. For a
geometric interpretation of the rest of Theorem 6, see Problems 5 and 6.
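Theorem 6 and identity (19) can be checked on a small example in real three-space, using an orthonormal system of two vectors that is deliberately not complete. The vectors, the element `f` and the helper names below are assumptions chosen for illustration.

```python
def dot(x, y):
    """Scalar product in real n-space."""
    return sum(a * b for a, b in zip(x, y))


# A hypothetical orthonormal (but not complete) system in three-space.
phis = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
f = (1.0, 2.0, 2.0)


def fourier_coefficients(f, phis):
    """Fourier coefficients c_k = (f, phi_k), as in (14)."""
    return [dot(f, p) for p in phis]


def squared_error(f, phis, a):
    """Squared distance from f to the linear combination sum of a_k * phi_k."""
    s = [sum(ak * p[i] for ak, p in zip(a, phis)) for i in range(len(f))]
    d = [fi - si for fi, si in zip(f, s)]
    return dot(d, d)


c = fourier_coefficients(f, phis)
sum_c2 = sum(ck * ck for ck in c)

# Bessel's inequality for this finite system, and the minimum value (20).
assert sum_c2 <= dot(f, f)
assert abs(squared_error(f, phis, c) - (dot(f, f) - sum_c2)) < 1e-12

# Identity (19) for another choice of coefficients.
a = [1.5, 2.0]
extra = sum((ak - ck) ** 2 for ak, ck in zip(a, c))
assert abs(squared_error(f, phis, a) - (dot(f, f) - sum_c2 + extra)) < 1e-12
```

Here the Fourier coefficients are 1 and 2, the squared norm of f is 9, and the smallest squared error is 4, attained only when the coefficients equal the Fourier coefficients.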


The case where Bessel’s inequality becomes an equality is particularly 
important: 


DEFINITION 4. Suppose equality holds in (17) for every $f \in R$, i.e.,
suppose

$$\sum_{k=1}^{\infty} c_k^2 = \|f\|^2 \tag{22}$$

for every $f \in R$. Then the orthonormal system $\varphi_1, \varphi_2, \ldots, \varphi_k, \ldots$ is said
to be closed.


Remark. This is another meaning of the word “closed,” not to be 
confused with its meaning in Sec. 6.4. The context will always make it 
clear which meaning is intended. Formula (22) is known as Parseval’s 
theorem. 


THEOREM 7. An orthonormal system $\varphi_1, \varphi_2, \ldots, \varphi_k, \ldots$ in a Euclidean
space R is closed if and only if every element $f \in R$ is the sum of its Fourier
series.

Proof. According to Definition 4, $\{\varphi_k\}$ is closed if and only if (22) holds
for every $f \in R$. Taking the limit as $n \to \infty$ in (20) and using (18), we see
that (22) holds for every $f \in R$ if and only if

$$\lim_{n \to \infty} \left\| f - \sum_{k=1}^{n} c_k \varphi_k \right\| = 0,$$

or equivalently

$$f = \sum_{k=1}^{\infty} c_k \varphi_k,$$

for every $f \in R$. ∎


The properties of being complete and being closed are intimately connected, 
as shown by 


THEOREM 8. An orthonormal system $\varphi_1, \varphi_2, \ldots, \varphi_k, \ldots$ in a Euclidean
space R is complete if and only if it is closed.

Proof. Suppose $\{\varphi_k\}$ is closed. Then, by Theorem 7, every element
$f \in R$ is the limit of the partial sums of its Fourier series. In other words,
linear combinations of elements of $\{\varphi_k\}$ are everywhere dense in R,
i.e., $\{\varphi_k\}$ is complete.

Conversely, suppose $\{\varphi_k\}$ is complete. Then every element $f \in R$ can
be approximated arbitrarily closely by a linear combination

$$\sum_{k=1}^{n} a_k \varphi_k$$

of elements of $\{\varphi_k\}$. But the partial sum

$$\sum_{k=1}^{n} c_k \varphi_k$$

of the Fourier series of f is at least as good an approximation, by Theorem 6.
Hence f is the sum of its own Fourier series. It follows from Theorem 7 that
$\{\varphi_k\}$ is closed. ∎


COROLLARY. Every separable Euclidean space R contains a closed
orthonormal system $\varphi_1, \varphi_2, \ldots, \varphi_n, \ldots$.

Proof. An immediate consequence of Theorem 8 and the corollary
to Theorem 5. ∎

Example 1. The orthonormal system (7) is closed in $l_2$.


Remark. In introducing the concepts of Fourier coefficients and Fourier
series, we assumed that the system $\{\varphi_k\}$ is orthonormal. More generally,
suppose $\{\varphi_k\}$ is orthogonal but not orthonormal, and let

$$\psi_k = \frac{\varphi_k}{\|\varphi_k\|}.$$

Then the system $\{\psi_k\}$ is orthonormal. Given any $f \in R$, let

$$c_k = (f, \psi_k) = \frac{1}{\|\varphi_k\|} (f, \varphi_k),$$

and consider the series

$$\sum_{k=1}^{\infty} c_k \psi_k = \sum_{k=1}^{\infty} \frac{c_k}{\|\varphi_k\|} \varphi_k = \sum_{k=1}^{\infty} a_k \varphi_k,$$

where

$$a_k = \frac{c_k}{\|\varphi_k\|} = \frac{(f, \varphi_k)}{\|\varphi_k\|^2}. \tag{23}$$

Then the coefficients (23) are called the Fourier coefficients of the element
$f \in R$ with respect to the orthogonal (but not orthonormal) system $\{\varphi_k\}$.
Substituting $c_k = a_k \|\varphi_k\|$ into (17), we get the following version of Bessel’s
inequality for arbitrary orthogonal systems:

$$\sum_{k=1}^{\infty} a_k^2 \|\varphi_k\|^2 \le \|f\|^2. \tag{17'}$$

If equality holds in (17’) for every $f \in R$, the orthogonal system $\{\varphi_k\}$ is said
to be closed, just as in Definition 4.
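Fourier coefficients with respect to an orthogonal system that is not orthonormal can be approximated numerically. The sketch below uses the sine functions of the trigonometric system on the interval from $-\pi$ to $\pi$ and the sample function $f(t) = t$; the midpoint rule, the number of subintervals and all function names are assumptions made for illustration.

```python
import math


def inner(f, g, a=-math.pi, b=math.pi, n=2000):
    """Approximate the integral of f(t) g(t) over [a, b] by the midpoint rule."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) * g(a + (i + 0.5) * h) for i in range(n))


def f(t):
    return t  # sample continuous function, chosen for illustration


def phi(k):
    """k-th sine function sin(kt) of the trigonometric system."""
    return lambda t: math.sin(k * t)


def fourier_coefficient(k):
    """a_k = (f, phi_k) / ||phi_k||^2 for the orthogonal system of sines."""
    p = phi(k)
    return inner(f, p) / inner(p, p)


a = [fourier_coefficient(k) for k in range(1, 6)]
# For f(t) = t the exact values are a_k = 2 (-1)^(k+1) / k.
assert all(abs(ak - 2 * (-1) ** (k + 1) / k) < 1e-3
           for k, ak in enumerate(a, start=1))

# Bessel's inequality for orthogonal systems, restricted to these five functions.
bessel_sum = sum(ak * ak * inner(phi(k), phi(k)) for k, ak in enumerate(a, start=1))
assert bessel_sum <= inner(f, f)
```

Only finitely many terms are summed, so the inequality is strict here; equality is approached as more sine terms are included, since the cosine coefficients of this odd function vanish.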


Example 2. The orthogonal system (8) is closed in $C^2_{[a,b]}$.


16.5. Complete Euclidean spaces. The Riesz-Fischer theorem. Given a
Euclidean space R, let $\{\varphi_k\}$ be an orthonormal (but not necessarily complete)
system in R. It follows from Bessel’s inequality that a necessary condition
for the numbers $c_1, c_2, \ldots, c_k, \ldots$ to be Fourier coefficients of an element
$f \in R$ is that the series

$$\sum_{k=1}^{\infty} c_k^2$$

converge. It turns out that this condition is also sufficient if R is complete,
as shown by

THEOREM 9 (Riesz-Fischer). Given an orthonormal system $\{\varphi_k\}$ in a
complete Euclidean space R, let the numbers $c_1, c_2, \ldots, c_k, \ldots$ be such
that

$$\sum_{k=1}^{\infty} c_k^2 \tag{24}$$

converges. Then there exists an element $f \in R$ with $c_1, c_2, \ldots, c_k, \ldots$ as
its Fourier coefficients, i.e., such that

$$\sum_{k=1}^{\infty} c_k^2 = \|f\|^2,$$

where

$$c_k = (f, \varphi_k) \qquad (k = 1, 2, \ldots).$$


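Proof. Writing

$$f_n = \sum_{k=1}^{n} c_k \varphi_k,$$

we have

$$\|f_{n+p} - f_n\|^2 = \|c_{n+1} \varphi_{n+1} + \cdots + c_{n+p} \varphi_{n+p}\|^2 = \sum_{k=n+1}^{n+p} c_k^2.$$

Hence $\{f_n\}$ converges to some element $f \in R$, by the convergence of (24)
and the completeness of R. Moreover,

$$(f, \varphi_k) = (f_n, \varphi_k) + (f - f_n, \varphi_k), \tag{25}$$

where the first term on the right equals $c_k$ if $n \ge k$ and the second term
approaches zero as $n \to \infty$, since

$$|(f - f_n, \varphi_k)| \le \|f - f_n\|\,\|\varphi_k\|.$$

Taking the limit as $n \to \infty$ in (25), we get

$$(f, \varphi_k) = c_k,$$

since the left-hand side is independent of n. Moreover,

$$\|f - f_n\| \to 0$$

as $n \to \infty$, and hence

$$\left( f - \sum_{k=1}^{n} c_k \varphi_k,\ f - \sum_{k=1}^{n} c_k \varphi_k \right) = (f, f) - \sum_{k=1}^{n} c_k^2 \to 0$$

as $n \to \infty$, i.e.,

$$\lim_{n \to \infty} \sum_{k=1}^{n} c_k^2 = \sum_{k=1}^{\infty} c_k^2 = (f, f).$$

∎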
THEOREM 10. Let $\{\varphi_k\}$ be an orthonormal system in a complete Euclidean
space R. Then $\{\varphi_k\}$ is complete if and only if R contains no nonzero
element orthogonal to all the elements of $\{\varphi_k\}$.

Proof. Suppose $\{\varphi_k\}$ is complete and hence closed (by Theorem 8),
and suppose f is orthogonal to all the elements of $\{\varphi_k\}$. Then all the
Fourier coefficients of f vanish. Hence

$$\|f\|^2 = \sum_{k=1}^{\infty} c_k^2 = 0$$

by Parseval’s theorem (22), i.e., f = 0.

Conversely, suppose $\{\varphi_k\}$ is not complete. Then R contains an
element $g \ne 0$ such that

$$\|g\|^2 > \sum_{k=1}^{\infty} c_k^2 \qquad \text{where } c_k = (g, \varphi_k)$$

(why?). By the Riesz-Fischer theorem, there exists an element $f \in R$
such that

$$(f, \varphi_k) = c_k, \qquad \|f\|^2 = \sum_{k=1}^{\infty} c_k^2.$$

But $f - g$ is orthogonal to all the $\varphi_k$, by construction. Moreover, it
follows from

$$\|f\|^2 = \sum_{k=1}^{\infty} c_k^2 < \|g\|^2$$

that $f - g \ne 0$. ∎


16.6. Hilbert space. The isomorphism theorem. Continuing our study of 
complete Euclidean spaces, we concentrate our attention on infinite- 
dimensional spaces, since finite-dimensional spaces are considered in great 
detail in courses on linear algebra. 
DEFINITION 5. By a Hilbert space¹⁵ is meant a Euclidean space which
is complete, separable and infinite-dimensional.


In other words, a Hilbert space is a set H of elements f, g,... of any 
kind such that 


1) H is a Euclidean space, i.e., a real linear space¹⁶ equipped with a
scalar product;

2) H is complete with respect to the metric $\rho(f, g) = \|f - g\|$;

3) H is separable, i.e., H contains a countable everywhere dense subset; 

4) H is infinite-dimensional, i.e., given any positive integer n, H contains 
n linearly independent elements. 


Example. The real space $l_2$ is a Hilbert space (check all the properties).


DEFINITION 6. Two Euclidean spaces R and R* are said to be isomorphic
(to each other) if there is a one-to-one correspondence $x \leftrightarrow x^*$, $y \leftrightarrow y^*$
between the elements of R and those of R* ($x, y \in R$; $x^*, y^* \in R^*$)
preserving linear operations and scalar products in the sense that¹⁷

$$x + y \leftrightarrow x^* + y^*, \qquad \alpha x \leftrightarrow \alpha x^*, \qquad (x, y) = (x^*, y^*).$$


It is well known that any two n-dimensional Euclidean spaces are isomorphic
to each other, and in particular that every n-dimensional Euclidean
space is isomorphic to the space $R^n$ of Example 1, p. 144.¹⁸ On the other
hand, two infinite-dimensional Euclidean spaces need not be isomorphic.
For example, the spaces $l_2$ and $C^2_{[a,b]}$ are not isomorphic, as can be seen from
the fact that $l_2$ is complete while $C^2_{[a,b]}$ is not (recall Examples 4 and 5,
p. 57). Nevertheless, for Hilbert spaces we have


THEOREM 11 (Isomorphism theorem). Any two Hilbert spaces are 
isomorphic. 


Proof. The theorem will be proved once we manage to show that
every Hilbert space H is isomorphic to $l_2$. Let $\{\varphi_k\}$ be any complete
orthonormal system in H (such exists by the corollary to Theorem 5),
and with every element $f \in H$ associate its Fourier coefficients $\{c_k\}$ with
respect to $\{\varphi_k\}$. Since

$$\sum_{k=1}^{\infty} c_k^2 < \infty,$$

by Theorem 8, the sequence $(c_1, c_2, \ldots, c_k, \ldots)$ belongs to $l_2$. Conversely,
by the Riesz-Fischer theorem, to every element $(c_1, c_2, \ldots, c_k, \ldots)$ in $l_2$
there corresponds an element $f \in H$ with the numbers $c_1, c_2, \ldots, c_k, \ldots$
as its Fourier coefficients. This correspondence between
the elements of H and those of $l_2$ is obviously one-to-one. Moreover, if

$$f \leftrightarrow (c_1, c_2, \ldots, c_k, \ldots),$$
$$\tilde f \leftrightarrow (\tilde c_1, \tilde c_2, \ldots, \tilde c_k, \ldots),$$

then clearly

$$f + \tilde f \leftrightarrow (c_1 + \tilde c_1, c_2 + \tilde c_2, \ldots, c_k + \tilde c_k, \ldots),$$
$$\alpha f \leftrightarrow (\alpha c_1, \alpha c_2, \ldots, \alpha c_k, \ldots),$$

i.e., sums go into sums and scalar multiples into scalar multiples with the
same factor. Finally, by Parseval's theorem,

$$(f, f) = \sum_{k=1}^{\infty} c_k^2, \qquad (\tilde f, \tilde f) = \sum_{k=1}^{\infty} \tilde c_k^2,$$

$$(f, f) + 2(f, \tilde f) + (\tilde f, \tilde f) = (f + \tilde f, f + \tilde f) = \sum_{k=1}^{\infty} (c_k + \tilde c_k)^2
= \sum_{k=1}^{\infty} c_k^2 + 2 \sum_{k=1}^{\infty} c_k \tilde c_k + \sum_{k=1}^{\infty} \tilde c_k^2,$$

and hence

$$(f, \tilde f) = \sum_{k=1}^{\infty} c_k \tilde c_k,$$

so that scalar products are preserved. ∎

15 Named after the celebrated German mathematician David Hilbert (1862-1943).

16 However, see Sec. 16.9.

17 Isomorphism of two normed linear spaces R and R* is defined in the same way,
except that preservation of scalar products is replaced by preservation of norms, i.e., by
the condition $\|x\| = \|x^*\|$.

18 See e.g., G. E. Shilov, op. cit., Theorem 29, p. 144.


Remark. Theorem 11 shows that to within an isomorphism, there is
only one Hilbert space (i.e., only one space with the four properties listed
above), and that this space has $l_2$ as its “coordinate realization,” just as
the space of all ordered n-tuples of real numbers with the scalar product
$\sum_{k=1}^{n} x_k y_k$ is the “coordinate realization” of axiomatically defined Euclidean
n-space.
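
The finite-dimensional half of this remark is easy to test numerically: passing from a vector to its Fourier coefficients with respect to an orthonormal basis preserves sums and scalar products. The following is a minimal sketch, assuming $R^3$ with a hypothetical rotated orthonormal basis `BASIS`; the helper names `dot`, `coordinates` and `from_coordinates` are illustrative and not taken from the text.

```python
import math

# A hypothetical orthonormal basis of R^3, rotated away from the standard one.
s = 1.0 / math.sqrt(2.0)
BASIS = [(s, s, 0.0), (s, -s, 0.0), (0.0, 0.0, 1.0)]


def dot(x, y):
    """Scalar product of two real tuples."""
    return sum(a * b for a, b in zip(x, y))


def coordinates(f):
    """Fourier coefficients c_k = (f, phi_k) with respect to BASIS."""
    return tuple(dot(f, phi) for phi in BASIS)


def from_coordinates(c):
    """Rebuild f = sum_k c_k phi_k from its coefficients."""
    return tuple(sum(ck * phi[i] for ck, phi in zip(c, BASIS)) for i in range(3))


f = (1.0, 2.0, 3.0)
g = (-4.0, 0.5, 2.0)
cf, cg = coordinates(f), coordinates(g)
```

Here `dot(f, g)` and `dot(cf, cg)` agree up to rounding, and `from_coordinates(cf)` reproduces `f`, which is the one-to-one correspondence of Definition 6 in miniature.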


16.7. Subspaces. Orthogonal complements and direct sums. In keeping
with the terminology of Sec. 15.2, by a linear manifold in a Hilbert space H
we mean a set L of elements of H such that $f, g \in L$ implies $\alpha f + \beta g \in L$ for
arbitrary numbers $\alpha$ and $\beta$, while by a subspace of H we mean a closed linear
manifold in H.


Lemma. If a metric space R has a countable everywhere dense subset,
then so does every subset $R' \subset R$.

Proof. Let

$$\xi_1, \xi_2, \ldots, \xi_n, \ldots$$

be a countable everywhere dense subset of R, and let

$$a_n = \inf_{\eta \in R'} \rho(\xi_n, \eta).$$

Then, given any positive integers n and p, there is a point $\eta_{np} \in R'$ such
that

$$\rho(\xi_n, \eta_{np}) < a_n + \frac{1}{p}.$$

Given any $\varepsilon > 0$ and any $\eta \in R'$, let

$$\frac{1}{p} < \frac{\varepsilon}{3},$$

and choose n such that

$$\rho(\xi_n, \eta) < \frac{\varepsilon}{3}.$$

Then

$$\rho(\xi_n, \eta_{np}) < a_n + \frac{1}{p} < \frac{\varepsilon}{3} + \frac{\varepsilon}{3} = \frac{2\varepsilon}{3},$$

and hence $\rho(\eta, \eta_{np}) < \varepsilon$. In other words, R' has an everywhere dense
subset $\{\eta_{np}\}$ (n, p = 1, 2, ...) containing no more than countably many
elements. ∎


THEOREM 12. Every subspace M of a Hilbert space H is either a (com- 
plete separable) Euclidean space or itself a Hilbert space. Moreover, M 
has an orthonormal basis, like H itself. 


Proof. The fact that M has properties 1) and 2) of Definition 5 is
obvious. The separability of M follows from the lemma. To construct an
orthonormal basis in M, apply Theorem 5 to any countable everywhere
dense subset of M. ∎


Subspaces of a Hilbert space H have certain special properties (not shared 
by subspaces of an arbitrary normed linear space), stemming from the 
presence of a scalar product in H and the associated concept of orthogonality: 


THEOREM 13. Let M be a subspace of a Hilbert space H, and let

$$M' = H \ominus M$$

denote the orthogonal complement of M, i.e., the set of all elements $h' \in H$
orthogonal to every $h \in M$. Then M' is also a subspace of H.

Proof. The linearity of M' is obvious, since

$$(h_1', h) = (h_2', h) = 0$$

implies

$$(\alpha_1 h_1' + \alpha_2 h_2', h) = 0$$

for arbitrary numbers $\alpha_1$ and $\alpha_2$. To show that M' is closed, suppose
$\{h_n'\}$ is a sequence of elements of M' converging to h'. Then, given any
$h \in M$,

$$(h', h) = \lim_{n \to \infty} (h_n', h) = 0,$$

and hence $h' \in M'$. ∎
Remark. By definition, $h' \in M'$ if and only if h' is orthogonal to every
$h \in M$. But then $h \in M$ if and only if h is orthogonal to every $h' \in M'$. Hence
$M' = H \ominus M$ implies $M = H \ominus M'$, and we can call M and M' (mutually)
orthogonal subspaces of H.


THEOREM 14. Let M be a subspace of a Hilbert space H, and let
$M' = H \ominus M$ be the orthogonal complement of M. Then every element
$f \in H$ has a unique representation of the form

$$f = h + h', \tag{26}$$

where $h \in M$, $h' \in M'$.


Proof. Given any $f \in H$, let $\{\varphi_k\}$ be an orthonormal basis in M, and
let

$$h = \sum_{k=1}^{\infty} c_k \varphi_k, \qquad c_k = (f, \varphi_k).$$

By Bessel's inequality,

$$\sum_{k=1}^{\infty} c_k^2 < \infty,$$

and hence, by the Riesz-Fischer theorem, h exists and belongs to M.
Let

$$h' = f - h.$$

Then obviously

$$(h', \varphi_k) = 0$$

for all k, and since any element $g \in M$ can be represented in the form

$$g = \sum_{k=1}^{\infty} a_k \varphi_k,$$

we have

$$(h', g) = \sum_{k=1}^{\infty} a_k (h', \varphi_k) = 0,$$

i.e., $h' \in M'$. This proves the existence of the representation (26).
To prove the uniqueness of (26), suppose there is another representation

$$f = h_1 + h_1',$$

where $h_1 \in M$, $h_1' \in M'$. Then

$$(h_1, \varphi_k) = (f, \varphi_k) = c_k$$

for all k, and hence

$$h_1 = h, \qquad h_1' = h'. \qquad ∎$$
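
The construction in the proof is an algorithm in finite dimensions: compute the coefficients $c_k = (f, \varphi_k)$, sum them into h, and take the remainder as h'. The following is a minimal sketch, assuming $R^4$ with a hypothetical two-dimensional subspace M spanned by the orthonormal vectors in `M_BASIS`; the function name `decompose` is illustrative only.

```python
import math


def dot(x, y):
    """Scalar product of two real tuples."""
    return sum(a * b for a, b in zip(x, y))


def decompose(f, basis):
    """Split f into h (in the span of the orthonormal `basis`) and h' = f - h."""
    coeffs = [dot(f, phi) for phi in basis]  # c_k = (f, phi_k)
    h = [sum(c * phi[i] for c, phi in zip(coeffs, basis)) for i in range(len(f))]
    h_perp = [a - b for a, b in zip(f, h)]  # h' = f - h
    return h, h_perp


s = 1.0 / math.sqrt(2.0)
M_BASIS = [(s, s, 0.0, 0.0), (0.0, 0.0, 1.0, 0.0)]  # assumed orthonormal basis of M
f = (3.0, 1.0, 2.0, 5.0)
h, h_perp = decompose(f, M_BASIS)
```

For this choice h is approximately (2, 2, 2, 0) and h' approximately (1, -1, 0, 5); h' is orthogonal to both basis vectors, and the squared norms of h and h' add up to that of f.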


COROLLARY 1. Every orthonormal system $\{\varphi_k\}$ in a Hilbert space H
can be enlarged to give a complete orthonormal system in H.


Proof. Let M be the linear closure of $\{\varphi_k\}$, so that $\{\varphi_k\}$ is complete
in M. Let $M' = H \ominus M$ be the orthogonal complement of M, and let
$\{\varphi_k'\}$ be a complete orthonormal system in M' (such exists by Theorem 12,
since M' is a subspace). Recalling (26), we see that the union of $\{\varphi_k\}$
and $\{\varphi_k'\}$ is a complete orthonormal system in H.


COROLLARY 2. Let M be a subspace of a Hilbert space H, and let
$M' = H \ominus M$. Then M' has codimension n if M has dimension n and
dimension n if M has codimension n.


Proof. An immediate consequence of the representation (26) and
Theorem 2, p. 122. ∎


Let M be a subspace of a Hilbert space H, with orthogonal complement
$M' = H \ominus M$. If every vector $f \in H$ can be represented in the form

$$f = h + h' \qquad (h \in M,\ h' \in M'),$$

we say that H is the direct sum of the orthogonal subspaces M and M', and
write

$$H = M \oplus M'.$$

The concept of a direct sum generalizes at once to the case of any finite or
even countable number of subspaces: Thus H is said to be the direct sum
of the subspaces $M_1, M_2, \ldots, M_n, \ldots$ and we write

$$H = M_1 \oplus M_2 \oplus \cdots \oplus M_n \oplus \cdots$$

if

1) The subspaces $M_n$ are pairwise orthogonal, i.e., every element in $M_j$
is orthogonal to every element in $M_k$ whenever $j \neq k$;
2) Every element $f \in H$ has a representation of the form

$$f = h_1 + h_2 + \cdots + h_n + \cdots, \tag{27}$$

where $h_n \in M_n$ (n = 1, 2, ...).
It is easy to see that the representation (27) is unique if it exists and that

$$\|f\|^2 = \sum_{n=1}^{\infty} \|h_n\|^2$$

(give the details).

Besides direct sums of subspaces, we can also talk about direct sums of a
finite or countable number of Hilbert spaces. Thus, given two Hilbert spaces
$H_1$ and $H_2$, by the direct sum

$$H = H_1 \oplus H_2$$

is meant the set of all ordered pairs $(h_1, h_2)$ with $h_1 \in H_1$, $h_2 \in H_2$, where
linear operations and the scalar product in H are defined by

$$(h_1, h_2) + (h_1', h_2') = (h_1 + h_1', h_2 + h_2'),$$
$$\alpha (h_1, h_2) = (\alpha h_1, \alpha h_2),$$
$$((h_1, h_2), (h_1', h_2')) = (h_1, h_1') + (h_2, h_2').$$

Consider the subspace of H consisting of all pairs of the form $(h_1, 0)$ and
the subspace consisting of all pairs of the form $(0, h_2)$. Then clearly these
two subspaces are orthogonal and can be identified in a natural way with $H_1$
and $H_2$, respectively. More generally, given any Hilbert spaces $H_1, H_2, \ldots,
H_n, \ldots$, by the direct sum

$$H = H_1 \oplus H_2 \oplus \cdots \oplus H_n \oplus \cdots$$

is meant the set of all sequences

$$h = (h_1, h_2, \ldots, h_n, \ldots) \qquad (h_n \in H_n)$$

such that

$$\sum_{n=1}^{\infty} \|h_n\|^2 < \infty,$$

with linear operations defined in the obvious way and the scalar product of
two elements $h = (h_1, h_2, \ldots, h_n, \ldots)$, $g = (g_1, g_2, \ldots, g_n, \ldots)$ defined by

$$(h, g) = \sum_{n=1}^{\infty} (h_n, g_n).$$


16.8. Characterization of Euclidean spaces. Given a normed linear space 
R, we now look for circumstances under which R is Euclidean. In other 
words, we look for extra conditions on the norm of R which guarantee that 
the norm be derivable from some suitably defined scalar product in R. 


THEOREM 15. A necessary and sufficient condition for a normed linear
space R to be Euclidean is that

$$\|f + g\|^2 + \|f - g\|^2 = 2(\|f\|^2 + \|g\|^2) \tag{28}$$

for every $f, g \in R$.
Proof. Thinking of $f + g$ and $f - g$ as the “diagonals of the parallelogram
in R with sides f and g,” we can interpret (28) as the analogue of a
familiar property of parallelograms in the plane, i.e., the sum of the
squares of the diagonals of a parallelogram equals the sum of the
squares of its sides. The necessity of (28) is obvious, since if R is
Euclidean, then

$$\|f + g\|^2 + \|f - g\|^2 = (f + g, f + g) + (f - g, f - g)$$
$$= (f, f) + 2(f, g) + (g, g) + (f, f) - 2(f, g) + (g, g)$$
$$= 2(\|f\|^2 + \|g\|^2).$$

To prove the sufficiency of (28), we set

$$(f, g) = \tfrac{1}{4}\bigl(\|f + g\|^2 - \|f - g\|^2\bigr), \tag{29}$$

and show that if (28) holds, then (29) has all the properties of a scalar
product listed on p. 142. Since (29) implies

$$(f, f) = \tfrac{1}{4}\bigl(\|2f\|^2 - \|f - f\|^2\bigr) = \|f\|^2, \tag{30}$$

the scalar product (29) clearly generates the given norm $\|\cdot\|$ in R. Moreover,
it follows at once from (29) and (30) that

1) $(f, f) \geq 0$ where $(f, f) = 0$ if and only if $f = 0$;
2) $(f, g) = (g, f)$.

The proof of the linearity properties

$$(f + g, h) = (f, h) + (g, h) \tag{31}$$

and

$$(\alpha f, g) = \alpha (f, g) \tag{32}$$

requires a little work. To prove (31), consider the function of three
vectors

$$\Phi(f, g, h) = 4[(f + g, h) - (f, h) - (g, h)],$$

or equivalently

$$\Phi(f, g, h) = \|f + g + h\|^2 - \|f + g - h\|^2 - \|f + h\|^2 + \|f - h\|^2 - \|g + h\|^2 + \|g - h\|^2 \tag{33}$$

after using (29). It follows from (28) that

$$\|f + g + h\|^2 = 2\|f + h\|^2 + 2\|g\|^2 - \|f + h - g\|^2. \tag{34}$$

Substituting (34), together with the analogous formula for $\|f + g - h\|^2$, into (33), we get

$$\Phi(f, g, h) = -\|f + h - g\|^2 + \|f - h - g\|^2 + \|f + h\|^2 - \|f - h\|^2 - \|g + h\|^2 + \|g - h\|^2. \tag{35}$$

Taking half the sum of (33) and (35), we find that

$$\Phi(f, g, h) = \tfrac{1}{2}\bigl(\|g + h + f\|^2 + \|g + h - f\|^2\bigr) - \tfrac{1}{2}\bigl(\|g - h + f\|^2 + \|g - h - f\|^2\bigr) - \|g + h\|^2 + \|g - h\|^2,$$

which becomes

$$\Phi(f, g, h) = \bigl(\|g + h\|^2 + \|f\|^2\bigr) - \bigl(\|g - h\|^2 + \|f\|^2\bigr) - \|g + h\|^2 + \|g - h\|^2 = 0$$

after applying (28) to both expressions in parentheses. But $\Phi(f, g, h) = 0$
is equivalent to (31).
To prove (32), we introduce the function

$$\varphi(c) = (cf, g) - c(f, g),$$

where f and g are fixed but arbitrary elements of R. It follows at once
from (29) that

$$\varphi(0) = \tfrac{1}{4}\bigl(\|g\|^2 - \|g\|^2\bigr) = 0$$

and $\varphi(-1) = 0$, since $(-f, g) = -(f, g)$. Hence, given any integer n,

$$(nf, g) = (\operatorname{sgn} n\,(f + \cdots + f), g) = \operatorname{sgn} n\,[(f, g) + \cdots + (f, g)] = |n| \operatorname{sgn} n\,(f, g) = n(f, g),$$

i.e., $\varphi(n) = 0$. Moreover, given any integers p, q ($q \neq 0$),

$$\Bigl(\frac{p}{q} f, g\Bigr) = p\Bigl(\frac{1}{q} f, g\Bigr) = \frac{p}{q}\, q \Bigl(\frac{1}{q} f, g\Bigr) = \frac{p}{q} (f, g),$$

i.e., $\varphi(c) = 0$ for all rational c. But $\varphi(c)$ is a continuous function of c
(why?), and hence $\varphi(c) \equiv 0$, which is equivalent to (32). ∎
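
In a space already known to be Euclidean, formula (29) must return the usual scalar product and the function $\Phi$ must vanish. The following is a small numerical sanity check of that, not a substitute for the proof: it assumes $R^5$ with the ordinary Euclidean norm and three pseudo-random vectors, and the names `polar` and `additivity_defect` are hypothetical.

```python
import random


def norm_sq(x):
    """Squared Euclidean norm."""
    return sum(t * t for t in x)


def polar(f, g):
    """Scalar product recovered from the norm via formula (29)."""
    plus = [a + b for a, b in zip(f, g)]
    minus = [a - b for a, b in zip(f, g)]
    return 0.25 * (norm_sq(plus) - norm_sq(minus))


def additivity_defect(f, g, h):
    """Phi(f, g, h) = 4[(f + g, h) - (f, h) - (g, h)], computed through (29)."""
    fg = [a + b for a, b in zip(f, g)]
    return 4.0 * (polar(fg, h) - polar(f, h) - polar(g, h))


rng = random.Random(0)  # fixed seed so the sketch is reproducible
f, g, h = ([rng.uniform(-1, 1) for _ in range(5)] for _ in range(3))
```

Up to rounding, `polar(f, g)` equals the coordinate-wise sum of products of `f` and `g`, and `additivity_defect(f, g, h)` is zero.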


Example 1. The n-dimensional space $R^n_p$, equipped with the norm

$$\|x\|_p = \Bigl(\sum_{k=1}^{n} |x_k|^p\Bigr)^{1/p},$$

is a normed linear space if $p \geq 1$ (see Example 10, p. 41) and a Euclidean
space if $p = 2$ (see Example 1, p. 144). However, $R^n_p$ fails to be Euclidean
if $p \neq 2$. In fact, for the two vectors

$$f = (1, 1, 0, \ldots, 0),$$
$$g = (1, -1, 0, \ldots, 0),$$

we have

$$f + g = (2, 0, 0, \ldots, 0),$$
$$f - g = (0, 2, 0, \ldots, 0),$$

and hence

$$\|f\|_p = \|g\|_p = 2^{1/p}, \qquad \|f + g\|_p = \|f - g\|_p = 2.$$

Therefore the “parallelogram condition” (28) fails if $p \neq 2$.
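
The failure of (28) for $p \neq 2$ can be seen by evaluating both sides for the vectors f and g above. The following is a minimal sketch, assuming n = 3 and a few sample values of p; the function names `p_norm` and `parallelogram_defect` are illustrative.

```python
def p_norm(x, p):
    """The norm ||x||_p = (sum_k |x_k|^p)^(1/p)."""
    return sum(abs(t) ** p for t in x) ** (1.0 / p)


def parallelogram_defect(f, g, p):
    """Left side of (28) minus right side of (28), in the p-norm."""
    plus = [a + b for a, b in zip(f, g)]
    minus = [a - b for a, b in zip(f, g)]
    return (p_norm(plus, p) ** 2 + p_norm(minus, p) ** 2
            - 2.0 * (p_norm(f, p) ** 2 + p_norm(g, p) ** 2))


f = (1.0, 1.0, 0.0)
g = (1.0, -1.0, 0.0)
defects = {p: parallelogram_defect(f, g, p) for p in (1.0, 1.5, 2.0, 3.0)}
```

In exact arithmetic the defect is $8 - 4 \cdot 2^{2/p}$, which is negative for p < 2, zero for p = 2 and positive for p > 2.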


Example 2. Consider the space $C_{[0, \pi/2]}$ of all functions continuous on the
interval $[0, \pi/2]$, and let

$$f(t) = \cos t, \qquad g(t) = \sin t.$$

Then

$$\|f\| = \|g\| = 1,$$

and

$$\|f + g\| = \max_{0 \leq t \leq \pi/2} |\cos t + \sin t| = \sqrt{2},$$
$$\|f - g\| = \max_{0 \leq t \leq \pi/2} |\cos t - \sin t| = 1.$$

Therefore

$$\|f + g\|^2 + \|f - g\|^2 \neq 2(\|f\|^2 + \|g\|^2).$$

It follows that the norm in $C_{[0, \pi/2]}$ cannot be generated by any scalar product
whatsoever, i.e., the space $C_{[0, \pi/2]}$ fails to be Euclidean. It is easy to see that
the same is true of the space $C_{[a,b]}$ for any a and b (a < b).
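
For the record, the arithmetic behind the inequality in Example 2 is as follows (a worked restatement of the values just computed, with nothing new assumed):

```latex
\|f + g\|^2 + \|f - g\|^2 = (\sqrt{2})^2 + 1^2 = 3,
\qquad
2\bigl(\|f\|^2 + \|g\|^2\bigr) = 2(1 + 1) = 4 .
```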


16.9. Complex Euclidean spaces. Besides real Euclidean spaces, we can 
also consider complex Euclidean spaces, i.e., complex linear spaces equipped 
with a scalar product. However, we must now modify the properties of the 
scalar product listed on p. 142, since in the complex case these properties 
are contradictory as they stand. In fact, it follows from properties 2) and 
3), p. 142 that 

$$(\lambda x, \lambda x) = \lambda^2 (x, x),$$

and hence, after choosing $\lambda = i$, that

$$(ix, ix) = -(x, x),$$

i.e., the norms of the vectors x and ix cannot both be positive, contrary to
property 1). To remedy this difficulty, we define the scalar product in a
complex Euclidean space R as a complex-valued function (x, y), defined for
every pair of elements $x, y \in R$, with the following properties:

1') $(x, x) \geq 0$ where $(x, x) = 0$ if and only if $x = 0$;
2') $(x, y) = \overline{(y, x)}$;
3') $(\lambda x, y) = \lambda (x, y)$;
4') $(x + y, z) = (x, z) + (y, z)$

(valid for all $x, y, z \in R$ and all complex $\lambda$). It follows from 2') and 3') that

$$(x, \lambda y) = \overline{(\lambda y, x)} = \bar{\lambda}\, \overline{(y, x)} = \bar{\lambda} (x, y)$$

(as usual, the overbar denotes the complex conjugate).




Example 1. The space $C^n$ introduced in Example 2, p. 119 becomes a
complex Euclidean space if we define the scalar product of two elements
$x = (x_1, \ldots, x_n)$, $y = (y_1, \ldots, y_n)$ as

$$(x, y) = \sum_{k=1}^{n} x_k \bar{y}_k.$$
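
Properties 1') to 3') and the conjugate-linearity in the second argument can be checked directly for this scalar product using Python's built-in complex numbers. The following is a minimal sketch with arbitrarily chosen sample vectors in $C^3$; the helper names `inner` and `scale` are illustrative.

```python
def inner(x, y):
    """Complex scalar product (x, y) = sum_k x_k * conj(y_k)."""
    return sum(a * b.conjugate() for a, b in zip(x, y))


def scale(lam, x):
    """Multiply every coordinate of x by the complex number lam."""
    return [lam * a for a in x]


x = [1 + 2j, -1j, 3 + 0j]
y = [2 - 1j, 4 + 0j, 1 + 1j]
lam = 2 - 3j
```

With these definitions `inner(scale(lam, x), y)` equals `lam * inner(x, y)`, while `inner(x, scale(lam, y))` equals `lam.conjugate() * inner(x, y)`; in particular `inner(scale(1j, x), scale(1j, x))` equals `inner(x, x)`, so the contradiction described above no longer arises.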


Example 2. The complex space $l_2$ with elements $x = (x_1, x_2, \ldots, x_k, \ldots)$,
$y = (y_1, y_2, \ldots, y_k, \ldots)$, ..., where

$$\sum_{k=1}^{\infty} |x_k|^2 < \infty, \qquad \sum_{k=1}^{\infty} |y_k|^2 < \infty, \ldots,$$

becomes an infinite-dimensional complex Euclidean space when equipped
with the scalar product

$$(x, y) = \sum_{k=1}^{\infty} x_k \bar{y}_k.$$


Example 3. The space $C^2_{[a,b]}$ of all complex-valued functions continuous
on the interval [a, b], equipped with the scalar product

$$(f, g) = \int_a^b f(t)\, \overline{g(t)}\, dt,$$

is another example of an infinite-dimensional complex Euclidean space.


The norm (length) of a vector in a complex Euclidean space is defined
by the same formula

$$\|x\| = \sqrt{(x, x)}$$

as in the real case. However, the notion of the angle between two vectors
x and y plays no role in the complex case, since the quantity

$$\frac{(x, y)}{\|x\| \, \|y\|}$$

is in general complex and hence cannot be the cosine of a real angle. On
the other hand, the notion of orthogonality is defined in the same way as
before, i.e., two elements x and y of a complex Euclidean space are said
to be orthogonal if (x, y) = 0.
Let $\{\varphi_k\}$ be any orthogonal system in a complex Euclidean space R, and
let f be any element of R. Then, just as in the real case, the numbers

$$a_k = \frac{1}{\|\varphi_k\|^2} (f, \varphi_k)$$

and the series

$$\sum_{k=1}^{\infty} a_k \varphi_k$$

are called the Fourier coefficients and the Fourier series of the element f,
with respect to the system $\{\varphi_k\}$. In the complex case, Bessel's inequality
(17') becomes

$$\sum_{k=1}^{\infty} \|\varphi_k\|^2 |a_k|^2 \leq \|f\|^2.$$

If the system $\{\varphi_k\}$ is orthonormal, the Fourier coefficients become

$$a_k = c_k = (f, \varphi_k),$$

and Bessel's inequality simplifies to

$$\sum_{k=1}^{\infty} |c_k|^2 \leq \|f\|^2.$$
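
The general form of Bessel's inequality reduces to the orthonormal form by normalizing the system. A short worked derivation, using only the definitions above (the symbols $e_k$ are introduced here purely for illustration):

```latex
e_k = \frac{\varphi_k}{\|\varphi_k\|}, \qquad
c_k = (f, e_k) = \frac{(f, \varphi_k)}{\|\varphi_k\|} = a_k \|\varphi_k\|,
\qquad \text{so} \qquad
\sum_{k=1}^{\infty} \|\varphi_k\|^2 |a_k|^2 = \sum_{k=1}^{\infty} |c_k|^2 \leq \|f\|^2 .
```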


By a complex Hilbert space is meant a complex Euclidean space which is 
complete, separable and infinite-dimensional. Theorem 11 carries over at 
once to the complex case, with isomorphism being defined exactly as in 
Definition 6: 


THEOREM 11' (Isomorphism theorem). Any two complex Hilbert spaces
are isomorphic.


Proof. This time show that every complex Hilbert space is isomorphic
to the complex space $l_2$, the “coordinate realization” of a complex
Hilbert space. ∎


Remark. As an exercise, the reader should state and prove the complex 
analogues of all the other theorems of Sec. 16. 


Problem 1. Prove that in a Euclidean space, the operations of addition,
multiplication by numbers and the formation of scalar products are all
continuous. More exactly, prove that if $x_n \to x$, $y_n \to y$ (in the sense of
norm convergence) and $\lambda_n \to \lambda$ (in the sense of ordinary convergence), then

$$x_n + y_n \to x + y, \qquad \lambda_n x_n \to \lambda x, \qquad (x_n, y_n) \to (x, y).$$

Hint. Use Schwarz's inequality.


Problem 2. Let R be the set of all functions f defined on the interval [0, 1]
such that

1) f(t) is nonzero at no more than countably many points $t_1, t_2, \ldots$;

2) $\sum_{i=1}^{\infty} f^2(t_i) < \infty$.

Define addition of elements and multiplication of elements by scalars in the
ordinary way, i.e., $(f + g)(t) = f(t) + g(t)$, $(\alpha f)(t) = \alpha f(t)$. If f and g are
two elements of R, nonzero only at the points $t_1, t_2, \ldots$ and $t_1', t_2', \ldots$,
respectively, define the scalar product of f and g as

$$(f, g) = \sum_{t \in \{t_i\} \cap \{t_j'\}} f(t)\, g(t).$$

Prove that this scalar product makes R into a Euclidean space. Prove that R
is nonseparable, i.e., that R contains no countable everywhere dense subset.


Problem 3. Give an example of a (nonseparable) Euclidean space which 
has no orthonormal basis. Prove that a complete Euclidean space (not 
necessarily separable) always has an orthonormal basis. 


Problem 4. Prove that every nested sequence of nonempty closed bounded 
convex sets in a complete Euclidean space (not necessarily separable) has a 
nonempty intersection. 


Comment. Cf. Problem 6, p. 66 and Problem 2, p. 141. 


Problem 5. Given a Euclidean space R, let $\varphi_1, \varphi_2, \ldots, \varphi_n, \ldots$ be an
orthonormal basis in R and f an arbitrary element of R. Prove that the
element

$$f - \sum_{k=1}^{n} a_k \varphi_k$$

is orthogonal to all linear combinations of the form

$$\sum_{k=1}^{n} b_k \varphi_k$$

if and only if

$$a_k = (f, \varphi_k) \qquad (k = 1, 2, \ldots, n).$$


Problem 6. According to elementary geometry, the length of the perpendicular
dropped from a point P to a line L or plane $\Pi$ is smaller than the
length of any other line segment joining P to L or $\Pi$. What is the natural
generalization of this fact to the case of an arbitrary Euclidean space?


Hint. Use Theorem 6 and the result of the preceding problem. 


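Problem 7. Let R be a complete Euclidean space (not necessarily separable),
so that R has an orthonormal basis $\{\varphi_\alpha\}$, by Problem 3. Prove that
every vector $f \in R$ satisfies the formulas

$$f = \sum_{\alpha} (f, \varphi_\alpha)\, \varphi_\alpha, \qquad \|f\|^2 = \sum_{\alpha} (f, \varphi_\alpha)^2,$$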


where neither sum contains more than countably many nonzero terms. 


Problem 8. Give an example of a Euclidean space R and an orthonormal
system $\{\varphi_n\}$ in R such that R contains no nonzero element orthogonal to every
$\varphi_n$, even though $\{\varphi_n\}$ fails to be complete.

Comment. By Theorem 10, R cannot be complete.


Problem 9. Given a Euclidean space R, not necessarily complete, let R*
be the completion of R as defined in Sec. 7.4. Define linear operations and
the scalar product in R* by “continuous extension” of those in $R \subset R^*$.
More exactly, if $x_n \to x$, $y_n \to y$ where $x_n, y_n \in R$, let

$$x + y = \lim (x_n + y_n), \qquad \alpha x = \lim \alpha x_n, \qquad (x, y) = \lim (x_n, y_n).$$

Prove that

a) These limits exist and are independent of the choice of the sequences
$\{x_n\}$, $\{y_n\}$ in R converging to x and y;
b) R* is itself a Euclidean space.


Complete $C^2_{[a,b]}$ in this way, and show that the resulting space is a Hilbert
space.


Comment. The elements belonging to the completion of $C^2_{[a,b]}$ but not to
$C^2_{[a,b]}$ are themselves functions, in fact discontinuous functions whose squares
are Lebesgue-integrable on [a, b], as defined in Sec. 29.


Problem 10. Prove that each of the following sets is a subspace of the
Hilbert space $l_2$:


a) The set of all $(x_1, x_2, \ldots, x_k, \ldots) \in l_2$ such that $x_1 = x_2$;
b) The set of all $(x_1, x_2, \ldots, x_k, \ldots) \in l_2$ such that $x_k = 0$ for all even k.


Problem 11. Show that every complex Euclidean space of finite dimension
n is isomorphic to the space $C^n$ of Example 1, p. 164. Generalize Problem 9
to the case where $C^2_{[a,b]}$ is the complex space of Example 3, p. 164.


17. Topological Linear Spaces 


17.1. Definitions and examples. Specification of a norm is only one way 
of introducing a topology into a linear space. There are many situations in 
analysis, notably in the theory of generalized functions (to be discussed 
in Sec. 21), where it is desirable to use other methods of equipping a linear 
space with a topology: 


DEFINITION 1. By a topological linear space is meant a set E with the 
following properties: 


1) E is a linear space; 

2) E is a topological space; 

3) The operations of addition of elements of E and multiplication of
elements of E by numbers (real or complex) are continuous with
respect to the topology in E, in the sense that

a) If $z_0 = x_0 + y_0$, then, given any neighborhood U of the point $z_0$,
there are neighborhoods V and W of the points $x_0$ and $y_0$,
respectively, such that $x + y \in U$ whenever $x \in V$, $y \in W$;

b) If $\alpha_0 x_0 = y_0$, then, given any neighborhood U of the point $y_0$,
there is a neighborhood V of the point $x_0$ and a number $\varepsilon > 0$
such that $\alpha x \in U$ whenever $x \in V$, $|\alpha - \alpha_0| < \varepsilon$.


THEOREM 1. Let E be a topological linear space, and let U be any
neighborhood of zero. Then the set

$$U + x_0 = \{y : y = x + x_0,\ x \in U\}$$

is a neighborhood of $x_0$. Moreover, every neighborhood of $x_0$ is a set of this
form, i.e., some neighborhood of zero “shifted by the vector $x_0$.”


Proof. It follows from property 3a) that the mapping $f(x) = x - x_0$
carrying E into itself is continuous. Hence, by Theorem 10, p. 87, the
preimage $f^{-1}(U)$ of any neighborhood U of the point zero is itself a
neighborhood. But $f^{-1}(U) = U + x_0$. Therefore $U + x_0$ is a neighborhood,
obviously of the point $x_0$. Similarly, given any neighborhood V
of the point $x_0$, let $U = V - x_0 = V + (-x_0)$. Then U is a neighborhood
of zero, by the continuity of the mapping $g(x) = x + x_0$. But
clearly $U + x_0 = V$. ∎


Remark. Thus the topology in E is determined by giving a neighborhood
base at zero, i.e., a system $\mathcal{N}_0$ of neighborhoods of zero with the property
that, given any open set $G \subset E$ containing the point zero, there is a neighborhood
$N \in \mathcal{N}_0$ contained in G. In fact, the mapping $f(x) = x + x_0$ carries a
neighborhood base at zero into a neighborhood base at $x_0$. Hence $\mathcal{N}_0$
and its “translates,” i.e., the system of all sets of the form $\{V : V = U + x,\
U \in \mathcal{N}_0,\ x \in E\}$, constitute a base for the topology in E. In this sense, $\mathcal{N}_0$
“generates” the topology in E.


Example 1. Every normed linear space is clearly a topological linear
space. In fact, it is an immediate consequence of the properties of a norm
that the operations of addition of vectors and multiplication of vectors by
scalars are continuous with respect to the topology “induced” by the norm.


Example 2. Let $R^\infty$ be the linear space of all numerical sequences
$x = (x_1, \ldots, x_k, \ldots)$, real or complex, and let $\mathcal{N}_0$ consist of all sets of the form

$$U_{k_1, \ldots, k_r;\, \varepsilon} = \{x : x \in R^\infty,\ |x_{k_1}| < \varepsilon, \ldots, |x_{k_r}| < \varepsilon\}$$

for some number $\varepsilon > 0$ and positive integers $k_1, \ldots, k_r$. Then $R^\infty$ becomes
a topological linear space when equipped with the topology generated by
$\mathcal{N}_0$.¹⁹


19 As an exercise, verify that $\mathcal{N}_0$ and its translates satisfy Theorem 2 (or Theorem 3)
of Sec. 9.3 and that the linear operations in $R^\infty$ are continuous with respect to the topology
generated by $\mathcal{N}_0$.
Example 3. Let $K_{[a,b]}$ be the linear space of all infinitely differentiable
functions on the interval [a, b],²⁰ and let $\mathcal{N}_0$ consist of all sets of the form

$$U_{r,\varepsilon} = \{\varphi : \varphi \in K_{[a,b]},\ |\varphi^{(0)}(x)| < \varepsilon, \ldots, |\varphi^{(r)}(x)| < \varepsilon \text{ for all } x \in [a, b]\}$$

for some number $\varepsilon > 0$ and positive integer r. Then $K_{[a,b]}$ becomes a topological
linear space when equipped with the topology generated by this
neighborhood base (again supply some missing details).


DEFINITION 2. A subset M of a topological linear space E is said to be
bounded if, given any neighborhood U of zero, there is a number $\alpha > 0$ such
that $M \subset \alpha U = \{z : z = \alpha x,\ x \in U\}$.²¹


DEFINITION 3. A topological linear space E is said to be locally bounded 
if it contains at least one nonempty bounded open set. 


THEOREM 2. Every normed linear space E is locally bounded. 


Proof. Given any $\varepsilon > 0$, the set of all $x \in E$ such that $\|x\| < \varepsilon$ is
obviously nonempty, bounded and open. ∎


DEFINITION 4. A topological linear space E is said to be locally convex 
if every nonempty open set in E contains a nonempty convex open subset. 


THEOREM 3. Every normed linear space E is locally convex. 


Proof. Merely note that every nonempty open set in E contains an
open sphere. ∎


Remark. It follows from Theorems 2 and 3 that every normed linear space
is both locally bounded and locally convex. Conversely, it can be shown that
every locally bounded and locally convex topological linear space satisfying
the first axiom of separation is normable, in the sense that E can be equipped
with a norm $\|\cdot\|$ generating the given topology in E, via the metric
$\rho(x, y) = \|x - y\|$.


17.2. Historical remarks. For some time it was thought that the concept
of a normed linear space (introduced in the thirties, notably in the work of
Banach) was general enough to serve all the concrete needs of analysis.
However, it subsequently became apparent that this was not so and that
there are a number of problems involving such spaces as the space of infinitely
differentiable functions, the space $R^\infty$ of all numerical sequences,
etc., in which the natural topology cannot be specified in terms of any norm
whatsoever. Thus topological linear spaces, as opposed to normed linear
spaces, are by no means “exotic” or “pathological.” On the contrary, some
of these spaces are no less natural and important a generalization of finite-dimensional
Euclidean space than, say, Hilbert space.


20 A function $\varphi$ is said to be infinitely differentiable if it has derivatives $\varphi^{(k)}$ of all orders
k = 0, 1, 2, ... (the zeroth derivative $\varphi^{(0)}$ is just the function $\varphi$ itself).

21 A sequence $\{x_n\}$ of points in E is said to be bounded if the set $\{x_1, x_2, \ldots, x_n, \ldots\}$,
consisting of all terms of the sequence, is bounded.


Problem 1. Reconcile Definition 2 with Problem 1, p. 141 in the case 
where E is a normed linear space. 


Problem 2. Let E be a topological linear space. Prove that


a) If U and V are open sets, then so is $U + V = \{z : z = x + y,\ x \in U,\ y \in V\}$;

b) If U is open, then so is $\alpha U = \{z : z = \alpha x,\ x \in U\}$ provided that $\alpha \neq 0$;

c) If $F \subset E$ is closed, then so is $\alpha F$ for arbitrary $\alpha$.


Problem 3. Prove that a topological linear space is a $T_1$-space if and only
if the intersection of all neighborhoods of zero contains no nonzero elements.


Problem 4. Prove that a topological linear space E automatically has the
following separation property: Given any point $x \in E$ and any neighborhood
U of x, there is another neighborhood V of x such that $[V] \subset U$.


Hint. If U is a neighborhood of zero, then, by the continuity of subtraction,
there is a neighborhood V of zero such that

$$V - V = \{z : z = x - y,\ x \in V,\ y \in V\} \subset U.^{22}$$

Suppose $y \in [V]$. Then every neighborhood of y, in particular V + y,
contains a point of V. Hence there is a point $z \in V$ such that $z + y \in V$. It
follows that $y \in V - V \subset U$.


Problem 5. Prove that a topological space T has the separation property
figuring in Problem 4 if and only if for each point $x \in T$ and each closed set
$F \subset T$ not containing x, there is an open set $O_1$ containing x and an open set
$O_2$ containing F such that $O_1 \cap O_2 = \emptyset$.


Comment. Thus, for $T_1$-spaces, this separation property is “halfway
between” that of a Hausdorff space and that of a normal space.


Problem 6. Given a topological linear space E, prove that


a) If $\{x_n\}$ is a convergent sequence of points in E, then the set
$M = \{x_1, x_2, \ldots, x_n, \ldots\}$ is bounded;

b) A subset $M \subset E$ is bounded if and only if, given any sequence $\{x_n\}$
of points in M and any sequence $\{\varepsilon_n\}$ of positive numbers converging
to zero, the sequence $\{\varepsilon_n x_n\}$ also converges to zero.


22 Here the minus sign in V — V does not have the usual meaning of a set difference 
(the same kind of notation was used in Sec. 14.5). 
Problem 7. Prove that


a) The space $R^\infty$ of Example 2, p. 168 is not locally bounded;
b) Every locally bounded topological linear space satisfies the first axiom
of countability.


Problem 8. Let x be any point of a locally convex topological linear 
space E, and let U be any neighborhood of x. Prove that x has a convex 
neighborhood contained in U. 


Hint. It is enough to consider the case x = 0. Suppose U is a neighborhood
of zero. Then there is a neighborhood V of zero such that $V - V \subset U$,
where V − V is the same as in the hint to Problem 4. Since E is locally
convex, there is a nonempty convex open set $V' \subset V$. If $x_0 \in V'$, then $V' - x_0$
is a convex neighborhood of zero contained in U.


Problem 9. Prove that an open set U in a topological linear space is 
convex if and only if U + U = 2U. 


Problem 10. Given a linear space E, a set $U \subset E$ is said to be symmetric
if $x \in U$ implies $-x \in U$. Let $\mathcal{B}$ be the set of all convex symmetric subsets
of E such that each coincides with its own interior. Prove that


a) $\mathcal{B}$ is a system of neighborhoods of zero determining a locally convex
topology $\tau$ in E which satisfies the first axiom of separation;

b) The topology $\tau$ is the strongest locally convex topology compatible
with the linear operations in E;

c) Every linear functional on E is continuous with respect to $\tau$.


Problem 11. Two norms $\|\cdot\|_1$ and $\|\cdot\|_2$ in a linear space E are said to be
compatible if, whenever a sequence $\{x_k\}$ in E is fundamental with respect
to both norms and converges to a limit $x \in E$ with respect to one of them, it
also converges to the same limit x with respect to the other norm. A linear
space E equipped with a countable system of compatible norms $\|\cdot\|_n$ is said
to be countably normed. Prove that every countably normed linear space
becomes a topological linear space when equipped with the topology
generated by the neighborhood base consisting of all sets of the form

$$U_{r,\varepsilon} = \{x : x \in E,\ \|x\|_1 < \varepsilon, \ldots, \|x\|_r < \varepsilon\} \tag{1}$$

for some number $\varepsilon > 0$ and positive integer r.


Problem 12. Prove that each of the following spaces is countably normed,
i.e., in each case verify the compatibility of the given system of norms $\|\cdot\|_n$:


a) The space $K_{[a,b]}$ of infinitely differentiable functions on [a, b], equipped
with the norms

$$\|f\|_n = \sup_{\substack{a \leq t \leq b \\ 0 \leq k \leq n}} |f^{(k)}(t)| \qquad (n = 0, 1, 2, \ldots); \tag{2}$$

b) The space $S_\infty$ of all infinitely differentiable functions f(t) on $(-\infty, \infty)$
such that f(t) and all its derivatives approach zero as $|t| \to \infty$ faster
than any power of $1/|t|$ (i.e., such that $t^p f^{(q)}(t) \to 0$ as $|t| \to \infty$ for
arbitrary p and q), equipped with the norms

$$\|f\|_n = \sup_{\substack{-\infty < t < \infty \\ p,\, q \leq n}} |t^p f^{(q)}(t)| \qquad (n = 0, 1, 2, \ldots);$$

c) The space $\Phi$ of all numerical sequences $x = (x_1, \ldots, x_k, \ldots)$ such
that

$$\sum_{k=1}^{\infty} k^n x_k^2$$

converges for all n = 0, 1, 2, ..., equipped with the norms

$$\|x\|_n = \sqrt{\sum_{k=1}^{\infty} k^n x_k^2} \qquad (n = 0, 1, 2, \ldots).$$

Show that (1) and (2) define the same topology in $K_{[a,b]}$ as in Example 3,
p. 169.

Comment. $\Phi$ might be called the space of “rapidly decreasing sequences.”


Problem 13. A norm $\|\cdot\|_1$ is said to be stronger than a norm $\|\cdot\|_2$ if there is
a constant $c > 0$ such that $\|x\|_1 \ge c\,\|x\|_2$ for all $x \in E$ (then $\|\cdot\|_2$ is said to
be weaker than $\|\cdot\|_1$). Discuss the norms (2) in this language.


Comment. Two norms are said to be comparable if one is stronger than 
the other, and equivalent if one is both stronger and weaker than the other 
(cf. Problem 7, p. 141). 


Problem 14. Prove that every countably normed space satisfies the first 
axiom of countability. 


Hint. Replace the system of neighborhoods $U_{r,\varepsilon}$ by the subsystem such
that $\varepsilon$ takes only the values

$$1, \tfrac{1}{2}, \ldots, \tfrac{1}{n}, \ldots$$

(this can be done without changing the topology).


Comment. Thus the topology in E can be described in terms of convergent 
sequences (recall Sec. 9.4). 


Problem 15. Prove that the topology in a countably normed space can be
specified in terms of the metric

$$\rho(x, y) = \sum_{n=1}^{\infty} \frac{1}{2^n} \frac{\|x - y\|_n}{1 + \|x - y\|_n} \qquad (x, y \in E). \tag{3}$$

First verify that $\rho(x, y)$ has all the properties of a metric, and is invariant
under shifts in the sense that $\rho(x + z, y + z) = \rho(x, y)$ for all $x, y, z \in E$.


Comment. A countably normed space is said to be complete if it is 
complete with respect to the metric (3). 
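The metric (3) can be explored numerically. The following Python sketch is only an illustration under stated assumptions: it works with finite sequences, uses the norms of the space of Problem 12c (restricted to indices n = 1, 2, ...), and truncates the infinite series after `terms` summands, which changes the value by less than 2^(-terms). The names `norm_n`, `rho`, `shift` and the sample vectors are hypothetical choices, not taken from the text.

```python
import math

def norm_n(x, n):
    """n-th norm of a finite sequence x = (x_1, ..., x_m): sqrt(sum_k k^n * x_k^2)."""
    return math.sqrt(sum((k ** n) * (xk ** 2) for k, xk in enumerate(x, start=1)))

def rho(x, y, terms=40):
    """Partial sum of (3): sum over n = 1..terms of 2^-n * r_n / (1 + r_n), r_n = ||x - y||_n."""
    diff = [xk - yk for xk, yk in zip(x, y)]
    total = 0.0
    for n in range(1, terms + 1):
        r = norm_n(diff, n)
        total += (r / (1.0 + r)) / 2 ** n
    return total

def shift(u, v):
    """Coordinatewise sum u + v of two finite sequences of equal length."""
    return [a + b for a, b in zip(u, v)]

# Illustrative sample points (assumed, not from the text).
x = [1.0, 0.5, 0.25]
y = [0.0, 0.5, -0.25]
z = [2.0, -1.0, 3.0]

print("rho(x, y)         =", rho(x, y))
print("rho(x+z, y+z)     =", rho(shift(x, z), shift(y, z)))   # shift invariance
print("triangle holds    =", rho(x, z) <= rho(x, y) + rho(y, z) + 1e-12)
```

Each summand r/(1 + r) is itself a bounded, shift-invariant pseudometric, which is why the truncated sum already shows the properties asked for in Problem 15.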


Problem 16. Prove that a sequence $\{x_n\}$ in a countably normed space is
fundamental with respect to the metric (3) if and only if it is fundamental
with respect to each of the norms $\|\cdot\|_n$. Prove that $\{x_n\}$ converges to an
element $x \in E$ with respect to the metric (3) if and only if it converges to
$x$ with respect to each of the norms $\|\cdot\|_n$.

Comment. Thus, in particular, a countably normed space $E$ is said to be
complete if a sequence $\{x_n\}$ in $E$ converges whenever it is fundamental with
respect to each of the norms $\|\cdot\|_n$.


Problem 17. An infinite-dimensional separable linear space $H$ equipped
with a countable system of scalar products $(\cdot, \cdot)_n$ is said to be countably
Hilbert if the norms

$$\|x\|_n = \sqrt{(x, x)_n} \qquad (x \in H)$$

generated by these scalar products are compatible and if the space $H$ is
complete. Prove that the space $\Phi$ of Problem 12c is countably Hilbert when
equipped with the scalar products

$$(x, y)_n = \sum_{k=1}^{\infty} k^n x_k y_k \qquad (n = 0, 1, 2, \ldots),$$

where $x = (x_1, \ldots, x_k, \ldots)$, $y = (y_1, \ldots, y_k, \ldots)$ are any two elements of $\Phi$.


Problem 18. The norms $\|\cdot\|_n$ in a countably normed space $E$ can be
assumed to satisfy the condition

$$\|x\|_k \le \|x\|_l \quad \text{if} \quad k < l, \tag{4}$$

since otherwise we can replace $\|\cdot\|_n$ by

$$\|x\|'_n = \sup \{\|x\|_1, \|x\|_2, \ldots, \|x\|_n\}.$$

(Prove that this does not change the topology in $E$.) Let $E_n$ denote the
completion of $E$ with respect to the norm $\|\cdot\|_n$. Using (4), prove that

$$E_1 \supset E_2 \supset \cdots \supset E_n \supset \cdots.$$

Clearly,

$$E \subset \bigcap_{n=1}^{\infty} E_n.$$

Prove that $E$ is complete if and only if

$$E = \bigcap_{n=1}^{\infty} E_n.$$
Problem 19. Let $C^{(n)}_{[a,b]}$ be the space of all functions defined on the interval
$[a, b]$ with continuous derivatives up to order $n$ inclusive, equipped with the
norm

$$\|f\|_n = \sup_{\substack{a \le t \le b \\ 0 \le k \le n}} |f^{(k)}(t)|$$

(note that $C^{(0)}_{[a,b]} = C_{[a,b]}$). Prove that $C^{(n)}_{[a,b]}$ is complete. Prove that $K_{[a,b]}$
equals the intersection

$$\bigcap_{n=0}^{\infty} C^{(n)}_{[a,b]}$$

and hence is complete (by Problem 18).


5


LINEAR FUNCTIONALS 


18. Continuous Linear Functionals 


18.1. Continuous linear functionals on a topological linear space. A (real)
functional $f$ defined on a topological linear space $E$ is said to be linear on $E$ if

$$f(\alpha x + \beta y) = \alpha f(x) + \beta f(y)$$

for all $x, y \in E$ and arbitrary numbers $\alpha, \beta$ (recall Sec. 13.5), and continuous
at the point $x_0 \in E$ if, given any $\varepsilon > 0$, there is a neighborhood $U$ of $x_0$ such
that

$$|f(x) - f(x_0)| < \varepsilon \tag{1}$$

for all $x \in U$ (recall Sec. 9.6). We say that the functional $f$ is continuous (on
$E$) if it is continuous at every point $x_0 \in E$.


THEOREM 1. Let $f$ be a linear functional on a topological linear space $E$,
and suppose $f$ is continuous at some point $x_0 \in E$. Then $f$ is continuous on
$E$, i.e., at every point of $E$.

Proof. Given any point $y \in E$ and any number $\varepsilon > 0$, let $U$ be a
neighborhood of $x_0$ such that $x \in U$ implies (1). Then

$$V = U + (y - x_0) = \{z : z = x + y - x_0,\ x \in U\}$$

is a neighborhood of $y$, by Theorem 1, p. 168. Moreover, $x \in V$ implies
$x + x_0 - y \in U$ and hence

$$|f(x) - f(y)| = |f(x + x_0 - y) - f(x_0)| < \varepsilon,$$

i.e., $f$ is continuous at $y$. ∎
COROLLARY. The continuity of a linear functional on a topological linear
space need only be checked at a single point, for example, at the point zero.

THEOREM 2. Let $f$ be a linear functional on a topological linear space $E$.
Then $f$ is continuous on $E$ if and only if $f$ is bounded in some neighborhood
of zero.¹


Proof. Suppose $f$ is continuous on $E$, in particular at the point zero.
Then, given any $\varepsilon > 0$, there is a neighborhood of zero in which
$|f(x)| < \varepsilon$. Obviously, $f$ is bounded in this neighborhood.

Conversely, suppose $f$ is bounded in some neighborhood $U$ of zero,
so that $|f(x)| < C$ for all $x \in U$, where $C$ is a suitable constant. Then,
given any $\varepsilon > 0$, we have $|f(x)| < \varepsilon$ for all $x$ in the neighborhood

$$\frac{\varepsilon}{C}\, U = \left\{ z : z = \frac{\varepsilon}{C}\, x,\ x \in U \right\},$$

i.e., $f$ is continuous at zero and hence on all of $E$. ∎


THEOREM 3. A necessary condition for a linear functional $f$ to be
continuous on a topological linear space $E$ is that $f$ be bounded on every
bounded set. The condition is also sufficient if $E$ satisfies the first axiom of
countability.


Proof. To prove the necessity, suppose $f$ is continuous on $E$. Then $f$
is bounded in some neighborhood $U$ of zero:

$$|f(x)| \le C \qquad (x \in U).$$

Let $M \subset E$ be any bounded set, as defined in Definition 2, p. 169. Then
$M \subset \alpha U$ for some $\alpha > 0$, and hence

$$|f(x)| \le C\alpha \qquad (x \in M),$$

i.e., $f$ is bounded on $M$.

As for the sufficiency, let $\{U_n\}$ be a countable neighborhood base at
the point zero such that

$$U_1 \supset U_2 \supset \cdots \supset U_n \supset \cdots$$

(cf. the proof of Theorem 7, p. 84). If $f$ fails to be continuous on $E$, it
cannot be bounded on any of these neighborhoods of zero. Therefore in
each $U_n$ there is a point $x_n$ such that $|f(x_n)| > n$. The sequence $\{x_n\}$ is
bounded (recall footnote 21, p. 169), and even converges to zero, while
the sequence $\{f(x_n)\}$ is unbounded. But then $f$ fails to be bounded on
the bounded set $\{x_1, x_2, \ldots, x_n, \ldots\}$, contrary to hypothesis. ∎


Guided by Theorem 3, we introduce 


¹ Recall footnote 14, p. 110.

DEFINITION 1. Given a linear functional $f$ on a topological linear space
$E$, suppose $f$ is bounded on every bounded subset of $E$. Then $f$ is said to be
a bounded linear functional.


Remark. In general, a bounded linear functional need not be continuous. 


18.2. Continuous linear functionals on a normed linear space. Suppose
$E$ is a normed linear space, so that in particular $E$ satisfies the first axiom of
countability (recall the remark on p. 83). Then, by Theorem 3, a linear
functional on $E$ is continuous if and only if it is bounded. But by a bounded
set in a normed linear space we mean a set contained in some closed sphere
$\|x\| \le C$ (recall Problem 1, p. 141). Therefore a linear functional $f$ on a
normed linear space is bounded (and hence continuous) if and only if it is
bounded on every closed sphere $\|x\| \le C$, or equivalently on the closed unit
sphere $\|x\| \le 1$, because of the linearity of $f$. In other words, $f$ is bounded
if and only if the number

$$\|f\| = \sup_{\|x\| \le 1} |f(x)| \tag{2}$$

is finite.

DEFINITION 2. Given a bounded linear functional $f$ on a normed linear
space $E$, the number (2), equal to the least upper bound of $|f(x)|$ on the
closed unit sphere $\|x\| \le 1$, is called the norm of $f$.


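THEOREM 4. The norm $\|f\|$ has the following two properties:

$$\|f\| = \sup_{x \ne 0} \frac{|f(x)|}{\|x\|}, \tag{3}$$

$$|f(x)| \le \|f\|\,\|x\| \quad \text{for all } x \in E. \tag{4}$$

Proof. Clearly,

$$\|f\| = \sup_{\|x\| \le 1} |f(x)| = \sup_{\|x\| = 1} |f(x)|$$

(why?). But the set of all vectors in $E$ of norm 1 coincides with the set
of all vectors

$$\frac{x}{\|x\|} \qquad (x \in E,\ x \ne 0), \tag{5}$$

and hence

$$\|f\| = \sup_{\|x\| = 1} |f(x)| = \sup_{x \ne 0} \left| f\!\left( \frac{x}{\|x\|} \right) \right| = \sup_{x \ne 0} \frac{|f(x)|}{\|x\|},$$

which proves (3). Moreover, since the vectors (5) all have norm 1, it
follows from (2) that

$$\left| f\!\left( \frac{x}{\|x\|} \right) \right| = \frac{|f(x)|}{\|x\|} \le \|f\| \qquad (x \in E,\ x \ne 0),$$

which implies (4) for $x \ne 0$. The validity of (4) for $x = 0$ is obvious. ∎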
Example 1. Let $R^n$ be Euclidean $n$-space, and let $a$ be any fixed nonzero
vector in $R^n$. Then the scalar product

$$f(x) = (x, a) \qquad (x \in R^n)$$

defines a functional on $R^n$ which is obviously linear. By Schwarz's inequality,

$$|f(x)| = |(x, a)| \le \|x\|\,\|a\|. \tag{6}$$

Therefore $f$ is bounded and hence continuous on $R^n$. It follows from (6) that

$$\frac{|f(x)|}{\|x\|} \le \|a\| \qquad (x \ne 0). \tag{7}$$

The right-hand side of (7) is independent of $x$, and hence

$$\sup_{x \ne 0} \frac{|f(x)|}{\|x\|} \le \|a\|,$$

i.e.,

$$\|f\| \le \|a\|.$$

But choosing $x = a$, we get

$$|f(a)| = |(a, a)| = \|a\|^2,$$

or equivalently

$$\frac{|f(a)|}{\|a\|} = \|a\|.$$

It follows from (3) that

$$\|f\| = \|a\|.$$

Example 2. More generally, let $R$ be an arbitrary Euclidean space, and
let $a$ be a fixed element of $R$. Then the same argument as in the preceding
example shows that the scalar product

$$f(x) = (x, a) \qquad (x \in R)$$

defines a bounded linear functional on $R$, with norm

$$\|f\| = \|a\|.$$

Example 3. The integral

$$I(x) = \int_a^b x(t)\, dt$$

is a linear functional on the space $C_{[a,b]}$. Since

$$|I(x)| = \left| \int_a^b x(t)\, dt \right| \le \max |x(t)|\,(b - a) = \|x\|\,(b - a),$$

where the equality holds if $x = \text{const}$, we see that the functional $I$ is
bounded, with norm

$$\|I\| = b - a. \tag{8}$$
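Example 4. More generally, let $y_0(t)$ be a fixed function in $C_{[a,b]}$, and let

$$I(x) = \int_a^b x(t)\, y_0(t)\, dt.$$

Then $I$ is a linear functional on $C_{[a,b]}$. Since

$$|I(x)| = \left| \int_a^b x(t)\, y_0(t)\, dt \right| \le \|x\| \int_a^b |y_0(t)|\, dt,$$

where the equality holds if $x(t) = \text{const}$ and $y_0(t)$ does not change sign
(in general the bound is approached by continuous functions $x(t)$ of norm 1
approximating the sign of $y_0(t)$), the functional $I$ is bounded, with norm

$$\|I\| = \int_a^b |y_0(t)|\, dt. \tag{9}$$

Note that (9) reduces to (8) in the case $y_0(t) \equiv 1$.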
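Formula (9) can be illustrated numerically when $y_0$ changes sign. The Python sketch below is an assumption-laden example, not part of the text: it takes $[a, b] = [0, 1]$ and $y_0(t) = t - 1/2$, approximates integrals by the midpoint rule, and uses hypothetical helper names. A constant $x$ gives $I(x) = 0$, while continuous "ramp" functions of norm 1 approximating the sign of $y_0$ push $|I(x)|$ toward $\int |y_0|\,dt = 1/4$.

```python
def integrate(g, a, b, n=20000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

def y0(t):
    """Assumed sample weight function; it changes sign at t = 1/2."""
    return t - 0.5

def I(x):
    """The functional I(x) = integral of x(t) * y0(t) over [0, 1]."""
    return integrate(lambda t: x(t) * y0(t), 0.0, 1.0)

def x_const(t):
    """The constant function 1, of norm 1 in C[0, 1]."""
    return 1.0

def x_ramp(k):
    """Continuous function of norm 1 approximating sign(y0) as k grows."""
    def x(t):
        return max(-1.0, min(1.0, k * (t - 0.5)))
    return x

bound = integrate(lambda t: abs(y0(t)), 0.0, 1.0)   # right-hand side of (9)

print("integral of |y0| =", bound)
print("I(const)         =", I(x_const))
for k in (1, 10, 100):
    print("I(ramp, k=%d)    =" % k, I(x_ramp(k)))
```

For this choice the exact value is $I(x_k) = 1/4 - 1/(3k^2)$ for $k \ge 2$, so the supremum $1/4$ is approached but not attained by these ramps.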
Example 5. As in Example 3, p. 124, let

$$\delta_{t_0}(x) = x(t_0)$$

be the linear functional on $C_{[a,b]}$ which assigns to each function $x(t) \in C_{[a,b]}$
its value at some fixed point $t_0 \in [a, b]$. Clearly

$$|x(t_0)| \le \max_{a \le t \le b} |x(t)| = \|x\|,$$

where equality holds if $x(t) = \text{const}$. Hence $\delta_{t_0}$ is bounded, with norm

$$\|\delta_{t_0}\| = 1.$$


The concept of the norm of a bounded linear functional on a normed 
linear space can be given a simple geometric interpretation. As shown in 
Theorem 4, p. 127, every nontrivial linear functional f can be associated 
with a hyperplane 

$$M_f = \{x : f(x) = 1\}.$$

Let $d$ be the distance from the hyperplane $M_f$ to the point $x = 0$, defined as

$$d = \inf_{f(x) = 1} \|x\|$$

(cf. Problem 9, p. 54). Since, as always,

$$|f(x)| \le \|f\|\,\|x\|,$$

$f(x) = 1$ implies

$$\|x\| \ge \frac{1}{\|f\|} \qquad (x \in M_f),$$

i.e.,

$$d \ge \frac{1}{\|f\|}. \tag{10}$$

On the other hand, it follows from (3) that, given any $\varepsilon > 0$, there is an
element $x_\varepsilon$ such that $f(x_\varepsilon) = 1$ and

$$(\|f\| - \varepsilon)\,\|x_\varepsilon\| < 1.$$

Therefore

$$d = \inf_{f(x) = 1} \|x\| < \frac{1}{\|f\| - \varepsilon},$$

and hence

$$d \le \frac{1}{\|f\|}, \tag{11}$$

since $\varepsilon > 0$ is arbitrary. Comparing (10) and (11), we get

$$d = \frac{1}{\|f\|},$$

i.e., the norm of the linear functional $f$ equals the reciprocal of the distance
between the hyperplane $f(x) = 1$ and the point $x = 0$.
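The relation $d = 1/\|f\|$ can be seen concretely in the Euclidean plane. In the Python sketch below, which is an illustration under assumptions and not part of the text, the functional is $f(x) = (x, a)$ with the sample vector $a = (3, 4)$, so that $\|f\| = \|a\| = 5$ by Example 1; the helper names are hypothetical.

```python
import math

a = (3.0, 4.0)                      # assumed sample vector; f(x) = (x, a)

def f(x):
    """The functional f(x) = (x, a) on the plane."""
    return x[0] * a[0] + x[1] * a[1]

def norm(x):
    """Euclidean norm in the plane."""
    return math.hypot(x[0], x[1])

norm_f = norm(a)                    # ||f|| = ||a|| for this functional

# Candidate nearest point of the line f(x) = 1 to the origin: a / ||a||^2.
x_star = (a[0] / norm_f ** 2, a[1] / norm_f ** 2)

def point_on_hyperplane(s):
    """x_star plus s times a direction orthogonal to a; f stays equal to 1."""
    return (x_star[0] - s * a[1], x_star[1] + s * a[0])

d = norm(x_star)

print("f(x_star) =", f(x_star))
print("d         =", d, " 1/||f|| =", 1.0 / norm_f)
for s in (-1.0, -0.1, 0.1, 1.0):
    p = point_on_hyperplane(s)
    print("s = %5.1f  f(p) = %.3f  ||p|| = %.4f" % (s, f(p), norm(p)))
```

Every other point of the line is farther from the origin than `x_star`, so the infimum defining $d$ is attained here and equals $1/5$.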


18.3. The Hahn-Banach theorem for a normed linear space. Let $f_0$ be a
linear functional defined on a subspace $L$ of a linear space $E$, satisfying the
condition

$$|f_0(x)| \le p(x), \tag{12}$$

where $p$ is a finite convex functional on $E$. Then, according to the Hahn-Banach
theorem (Theorem 5, p. 132), $f_0$ can be extended onto the whole
space $E$ without violating the condition (12). As applied to bounded linear
functionals on a normed linear space $E$, this result can be formulated as
follows:

THEOREM 5 (Hahn-Banach). Given a real normed linear space $E$, let
$L$ be a subspace of $E$ and $f_0$ a bounded linear functional on $L$. Then $f_0$ can
be extended to a bounded linear functional $f$ on the whole space $E$ without
increasing its norm, i.e.,

$$\|f\|_{\text{on } E} = \|f_0\|_{\text{on } L}.$$

Proof. We need only choose the functional $p$ in Theorem 5, p. 132 to
be the convex functional $k\,\|x\|$, where

$$k = \|f_0\|_{\text{on } L}. \ ∎$$
This form of the Hahn-Banach theorem has a simple geometric interpretation.
The equation

$$f_0(x) = 1 \tag{13}$$

specifies a hyperplane in the subspace $L$, at distance

$$\frac{1}{\|f_0\|}$$

from the origin (the point $x = 0$). The fact that the functional $f_0$ can be
extended onto the whole space $E$ without increasing its norm means that the
hyperplane (13) can be extended to a larger hyperplane in the whole space
$E$ in such a way that the distance between the larger hyperplane and the
origin is the same as the distance between the hyperplane (13) and the origin.

In the same way, starting from the complex version of the Hahn-Banach 
theorem (Theorem 5’, p. 134), we get 


THEOREM 5'. Given a complex normed linear space $E$, let $L$ be a
subspace of $E$ and $f_0$ a bounded linear functional on $L$. Then $f_0$ can be
extended to a bounded linear functional $f$ on the whole space $E$ without
increasing its norm, i.e.,

$$\|f\|_{\text{on } E} = \|f_0\|_{\text{on } L}.$$

In the case of an arbitrary topological linear space $E$, a nontrivial continuous
linear functional on $E$ may not even exist. However, by imposing
suitable restrictions on $E$, we can guarantee the existence of "sufficiently
many" continuous linear functionals on $E$.²


DEFINITION 3. A topological linear space $E$ is said to have sufficiently
many continuous linear functionals if for each pair of distinct points
$x_1, x_2 \in E$ there exists a continuous linear functional $f$ on $E$ such that
$f(x_1) \ne f(x_2)$, or equivalently, if for each nonzero element $x_0 \in E$ there
exists a continuous linear functional $f$ on $E$ such that $f(x_0) \ne 0$.


THEOREM 6. Every normed linear space E has sufficiently many con- 
tinuous linear functionals. 


Proof. Given any nonzero element $x_0 \in E$, we define a linear
functional

$$f_0(\lambda x_0) = \lambda$$

on the set $L$ of all elements of the form $\lambda x_0$. We then use the Hahn-Banach
theorem to extend $f_0$ onto the whole space $E$. This gives a
continuous linear functional $f$ on $E$ such that $f(x_0) = 1 \ne 0$. ∎


Problem 1. Prove that a functional $f$ on a $T_1$-space $E$ is continuous at a
point $x \in E$ if and only if $x_n \to x$ implies $f(x_n) \to f(x)$.


Problem 2. Prove that every linear functional on a finite-dimensional 
topological linear space is automatically continuous. 


Problem 3. Let $E$ be a topological linear space. Prove that a linear
functional $f$ on $E$ is continuous if and only if

a) Its null space $\{x : f(x) = 0\}$ is closed in $E$;
b) There exists an open set $U \subset E$ and a number $t$ such that $t \notin f(U)$.

² See Theorem 6 and Problems 7-8.


182 LINEAR FUNCTIONALS CHAP. 5 


Problem 4. Given a topological linear space $E$, prove that

a) If every linear functional on $E$ is continuous, then the topology in
$E$ is the topology $\tau$ of Problem 10, p. 171;

b) If $E$ is infinite-dimensional and normable, then there exists a noncontinuous
linear functional on $E$;

c) If $E$ has a neighborhood base at zero whose power does not exceed
the algebraic dimension of $E$, then there exists a noncontinuous linear
functional on $E$.

Hint. In b) use the existence of a Hamel basis in $E$ (recall Problem 4,
p. 128, where algebraic dimension is also defined).


Problem 5. Prove that 
f(x) = ax(0) + bx(1), 


a(x) = fo" x(a) at — J” x(0) at 


are both bounded linear functionals on the space $C_{[0,1]}$. What are their
norms? 


Problem 6. As in Problem 11, p. 171, let $E$ be a countably normed space
with norms $\|\cdot\|_n$, where

$$\|x\|_1 \le \|x\|_2 \le \cdots \le \|x\|_n \le \cdots \tag{14}$$

(as in Problem 18, p. 173, this condition entails no loss of generality).
Let $E^*$ be the set of all continuous linear functionals on $E$, and let $E_n^*$ be
the set of all linear functionals on $E$ which are continuous with respect to
the norm $\|\cdot\|_n$. Prove that

$$E_1^* \subset E_2^* \subset \cdots \subset E_n^* \subset \cdots$$

and

$$E^* = \bigcup_{n=1}^{\infty} E_n^*. \tag{15}$$

Hint. If $f$ is a continuous linear functional on $E$, then, by Theorem 2,
there is a neighborhood $U$ of zero in which $f$ is bounded. It follows from
(14) and the definition of the topology in $E$ that there is a number $\varepsilon > 0$ and
a positive integer $k$ such that the open sphere $\|x\|_k < \varepsilon$ is contained in $U$.
Being bounded on this sphere, $f$ is bounded and continuous with respect to
the norm $\|\cdot\|_k$.

Comment. Let $f$ be a continuous linear functional on $E$, i.e., let $f \in E^*$.
Then by the order of $f$ is meant the smallest integer $n$ for which $f \in E_n^*$. It
follows from (15) that every continuous linear functional on $E$ is of finite
order.
Problem 7. Prove that every countably normed space $E$ has sufficiently
many continuous linear functionals.

Hint. Given any nonzero element $x_0 \in E$, use Theorem 6 to construct a
linear functional $f$ continuous with respect to the norm $\|\cdot\|_1$ such that

$$f(x_0) \ne 0.$$


Problem 8. Show that every real locally convex topological linear space
$E$ satisfying the first axiom of separation has sufficiently many continuous
linear functionals.

Hint. Given any nonzero element $x_0 \in E$, show that there is a convex
symmetric³ neighborhood $U$ of zero such that $x_0 \notin U$. Let $p_U$ be the
Minkowski functional of $U$. Then, as in the proof of Theorem 6, p. 136,
$p_U$ is a finite convex functional on $E$ such that $p_U(-x) = p_U(x)$ and

$$p_U(x) \le 1 \ \text{ if } \ x \in U, \qquad p_U(x_0) \ge 1.$$

Define a linear functional $f_0(\lambda x_0) = \lambda$ on the set $L$ of all elements of the
form $\lambda x_0$. Clearly $|f_0(x)| \le p_U(x)$ on $L$ and $f_0(x_0) = 1$. Now use the Hahn-Banach
theorem to extend $f_0$ onto the whole space $E$.


Comment. The importance of locally convex spaces is mainly due to this 
property (which continues to hold in the complex case). 


19. The Conjugate Space 


19.1. Definition of the conjugate space. The operations of addition of 
functionals and multiplication of functionals by numbers are defined in the 
obvious way: 


DEFINITION 1. Let $f$ and $g$ be two functionals defined on a topological
linear space $E$, and let $\alpha$ be any number. Then by the sum of $f$ and $g$,
denoted by $f + g$, is meant the functional whose value at every point $x \in E$
is the sum of the values of $f$ and $g$ at $x$, while by the product of $\alpha$ and $f$,
denoted by $\alpha f$, is meant the functional whose value at every point $x \in E$ is
the product of $\alpha$ and the value of $f$ at $x$. More concisely,

$$(f + g)(x) = f(x) + g(x),$$
$$(\alpha f)(x) = \alpha f(x)$$

for every $x \in E$.

Clearly, if $f$ and $g$ are linear functionals, then so are $f + g$ and $\alpha f$. Moreover,
if $f$ and $g$ are bounded (and hence continuous), so are $f + g$ and $\alpha f$.

³ Recall Problem 10, p. 171.

Let $E^*$ be the set of all continuous linear functionals on $E$. Then the
space $E^*$, called the conjugate space of $E$, is itself a linear space, when
equipped with the operations of addition of functionals and multiplication
of functionals by numbers. This can be seen at once by verifying the three
axioms in Definition 1, p. 118. Note that the zero element in $E^*$ is the
functional $f \equiv 0$, equal to zero for all $x \in E$.

The next step is to introduce a topology in E*, besides the linear operations 
just described. This can be done in various ways. First we consider the 
particularly simple case where the original space E is a normed linear space. 


19.2. The conjugate space of a normed linear space. Let $f$ be a continuous
linear functional on a normed linear space $E$. In Sec. 18.2 we introduced the
concept of the norm of $f$, equal to

$$\|f\| = \sup_{x \ne 0} \frac{|f(x)|}{\|x\|}$$

(recall Theorem 4, p. 177). This quantity clearly has all the properties of a
norm, as listed on p. 138. In fact,

1) $\|f\| \ge 0$ where $\|f\| = 0$ if and only if $f = 0$;

2) $\|\alpha f\| = |\alpha|\,\|f\|$;

3) $\|f + g\| \le \|f\| + \|g\|$, since obviously

$$\sup_{x \ne 0} \frac{|f(x) + g(x)|}{\|x\|} \le \sup_{x \ne 0} \frac{|f(x)|}{\|x\|} + \sup_{x \ne 0} \frac{|g(x)|}{\|x\|}.$$

Hence the space $E^*$ conjugate to $E$ can be made into a normed linear space
by simply equipping each functional $f \in E^*$ with its norm $\|f\|$. The corresponding
topology in $E^*$ is called the strong topology in $E^*$. In cases where
we want to emphasize that $E^*$ is equipped with the norm $\|\cdot\|$, we will write
$(E^*, \|\cdot\|)$ instead of $E^*$.


Example 1. Let $E$ be Euclidean $n$-space (real or complex), and let
$e_1, \ldots, e_n$ be any basis in $E$, so that every vector $x \in E$ has a unique
representation of the form

$$x = \sum_{k=1}^{n} x_k e_k.$$

If $f$ is a linear functional on $E$, then clearly

$$f(x) = \sum_{k=1}^{n} x_k f(e_k). \tag{1}$$

Thus a linear functional on $E$ is uniquely determined by its values on the
basis vectors $e_1, \ldots, e_n$, where these values can be assigned arbitrarily.
Consider the linear functionals $f_1, \ldots, f_n$ defined by

$$f_j(e_k) = \begin{cases} 1 & \text{if } j = k, \\ 0 & \text{if } j \ne k. \end{cases}$$

It is clear that these functionals are linearly independent, and moreover that

$$f_j(x) = x_j.$$

Hence we can write (1) in the form

$$f(x) = \sum_{k=1}^{n} f(e_k)\, f_k(x).$$

Thus the functionals $f_1, \ldots, f_n$ form a basis in the space $E^*$, called the dual
of the basis $e_1, \ldots, e_n$ in the original space $E$. Therefore $E^*$ is itself an
$n$-dimensional linear space. Of course, different norms in $E$ "induce"
different norms in $E^*$ (see Problem 1).
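The expansion of a functional in the dual basis can be verified on a small case. The Python sketch below is illustrative only: it assumes the plane with the basis $e_1 = (1, 1)$, $e_2 = (1, -1)$, a sample functional $f(x) = 5\xi_1 - 2\xi_2$ written in standard coordinates $(\xi_1, \xi_2)$, and hypothetical helper names; none of these choices comes from the text.

```python
def coordinates(x, e1, e2):
    """Coordinates (x_1, x_2) of x relative to the basis e1, e2 (Cramer's rule)."""
    det = e1[0] * e2[1] - e1[1] * e2[0]
    c1 = (x[0] * e2[1] - x[1] * e2[0]) / det
    c2 = (e1[0] * x[1] - e1[1] * x[0]) / det
    return c1, c2

e1 = (1.0, 1.0)      # assumed basis of the plane
e2 = (1.0, -1.0)

def f1(x):
    """Dual functional f_1: the first coordinate of x in the basis e1, e2."""
    return coordinates(x, e1, e2)[0]

def f2(x):
    """Dual functional f_2: the second coordinate of x in the basis e1, e2."""
    return coordinates(x, e1, e2)[1]

def f(x):
    """An assumed sample linear functional, given in standard coordinates."""
    return 5.0 * x[0] - 2.0 * x[1]

x = (3.0, 1.0)       # x = 2*e1 + 1*e2

# Expansion f(x) = f(e1) * f_1(x) + f(e2) * f_2(x) from the example above.
expansion = f(e1) * f1(x) + f(e2) * f2(x)

print("coordinates of x :", coordinates(x, e1, e2))
print("f(x)             :", f(x))
print("dual expansion   :", expansion)
```

Here $f(e_1) = 3$ and $f(e_2) = 7$, so the pair $(3, 7)$ gives the coordinates of $f$ in the dual basis.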


Example 2. Let $c_0$ be the space of all sequences $x = (x_1, \ldots, x_k, \ldots)$
converging to zero, with norm

$$\|x\| = \sup_k |x_k|.$$

Then the space $(c_0^*, \|\cdot\|)$ conjugate to $c_0$ is isomorphic (see footnote 17,
p. 155) to the space $l_1$ of all absolutely summable sequences $f = (f_1, \ldots,
f_k, \ldots)$,⁴ with norm

$$\|f\| = \sum_{k=1}^{\infty} |f_k|.$$

To prove this, we first note that, given any element $f = (f_1, \ldots, f_k, \ldots) \in l_1$,
the formula

$$\tilde{f}(x) = \sum_{k=1}^{\infty} x_k f_k \tag{2}$$

defines a functional $\tilde{f}$ on the space $c_0$, where $\tilde{f}$ is clearly linear. Moreover,
it follows from (2) that

$$|\tilde{f}(x)| \le \|x\| \sum_{k=1}^{\infty} |f_k|$$

and hence

$$\|\tilde{f}\| \le \|f\|. \tag{3}$$

⁴ A sequence $\{f_k\}$, or $f = (f_1, \ldots, f_k, \ldots)$ in "point notation," is said to be absolutely
summable if

$$\sum_{k=1}^{\infty} |f_k| < \infty.$$

Consider the vectors

$$e_1 = (1, 0, 0, \ldots),$$
$$e_2 = (0, 1, 0, \ldots),$$
$$e_3 = (0, 0, 1, \ldots),$$
$$\ldots$$

in $c_0$, and let

$$x^{(n)} = \sum_{k=1}^{n} \frac{f_k}{|f_k|}\, e_k$$

(if $f_k = 0$, we set $f_k / |f_k| = 0$). Then $x^{(n)} \in c_0$, and

$$\|x^{(n)}\| \le 1. \tag{4}$$

Moreover

$$\tilde{f}(x^{(n)}) = \sum_{k=1}^{n} \frac{f_k}{|f_k|}\, \tilde{f}(e_k) = \sum_{k=1}^{n} |f_k|,$$

so that

$$\lim_{n \to \infty} \tilde{f}(x^{(n)}) = \sum_{k=1}^{\infty} |f_k| = \|f\|. \tag{5}$$

It follows from (4) and (5) that

$$\|\tilde{f}\| \ge \|f\| \tag{6}$$

(why?). Comparing (3) and (6), we get

$$\|\tilde{f}\| = \|f\|.$$
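The vectors $x^{(n)}$ used above to reach the norm can be written out for a concrete sequence. The Python sketch below is an illustration under assumptions, not part of the text: it takes the hypothetical absolutely summable sequence $f_k = (-1)^k / 2^k$, for which $\sum |f_k| = 1$, and represents elements of $c_0$ with finitely many nonzero terms as Python lists.

```python
def f_coeff(k):
    """k-th term of an assumed absolutely summable sequence: f_k = (-1)^k / 2^k."""
    return (-1) ** k / 2 ** k

def sign(t):
    """The quotient t / |t|, with the convention sign(0) = 0 used in the text."""
    return 1.0 if t > 0 else -1.0 if t < 0 else 0.0

def x_n(n):
    """First n coordinates of x^(n) = sum_{k<=n} sign(f_k) e_k (the rest are zero)."""
    return [sign(f_coeff(k)) for k in range(1, n + 1)]

def f_tilde(x):
    """Value of the functional (2) on a finitely supported sequence x."""
    return sum(xk * f_coeff(k) for k, xk in enumerate(x, start=1))

def sup_norm(x):
    """Norm of c_0 for a finitely supported sequence."""
    return max((abs(xk) for xk in x), default=0.0)

for n in (1, 2, 5, 10, 20):
    xn = x_n(n)
    print("n = %2d  ||x^(n)|| = %.1f  f~(x^(n)) = %.10f" % (n, sup_norm(xn), f_tilde(xn)))
```

The printed values equal the partial sums $1 - 2^{-n}$ of $\sum |f_k|$, so they increase toward $\|f\| = 1$ while $\|x^{(n)}\|$ stays equal to 1, which is the content of (4) and (5).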


Thus the mapping carrying $f$ into $\tilde{f}$ is a "norm-preserving" mapping of
$l_1$ into $c_0^*$. We must still verify that this mapping is one-to-one and "onto"
(see p. 5), i.e., that every functional $\tilde{f} \in c_0^*$ has a unique representation of
the form (2), where $f = (f_1, \ldots, f_k, \ldots) \in l_1$. Let $x = (x_1, \ldots, x_k, \ldots) \in c_0$.
Then

$$x = \sum_{k=1}^{\infty} x_k e_k,$$

where the series on the right converges in $c_0$ to the element $x$, since

$$\left\| x - \sum_{k=1}^{n} x_k e_k \right\| = \sup_{k > n} |x_k| \to 0$$

as $n \to \infty$. Since the functional $\tilde{f} \in c_0^*$ is continuous,

$$\tilde{f}(x) = \sum_{k=1}^{\infty} x_k \tilde{f}(e_k)$$

(where is the continuity used?). Hence $\tilde{f}$ has a unique representation of the
form (2), and we need only verify that

$$\sum_{k=1}^{\infty} |\tilde{f}(e_k)| < \infty. \tag{7}$$

This time let

$$x^{(n)} = \sum_{k=1}^{n} \frac{\tilde{f}(e_k)}{|\tilde{f}(e_k)|}\, e_k$$

(with the same convention as before if $\tilde{f}(e_k) = 0$). Noting that $x^{(n)} \in c_0$ and $\|x^{(n)}\| \le 1$, we find that

$$\sum_{k=1}^{n} |\tilde{f}(e_k)| = \tilde{f}(x^{(n)}) \le \|\tilde{f}\|.$$

But this implies (7), since $n$ can be made arbitrarily large.

Whether or not the original space $E$ is complete, we have

THEOREM 1. The conjugate space $(E^*, \|\cdot\|)$ is complete.


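Proof. Let $\{f_n\}$ be a fundamental sequence of functionals in $E^*$. Then,
given any $\varepsilon > 0$, there is an integer $N$ such that $n, n' > N$ implies

$$\|f_n - f_{n'}\| < \varepsilon,$$

so that

$$|f_n(x) - f_{n'}(x)| \le \|f_n - f_{n'}\|\,\|x\| \le \varepsilon\,\|x\|$$

for every $x \in E$. Therefore the sequence $\{f_n(x)\}$ is fundamental and hence
convergent for every $x \in E$. Let

$$f(x) = \lim_{n \to \infty} f_n(x).$$

Then $f$ is linear, since

$$f(\alpha x + \beta y) = \lim_{n \to \infty} f_n(\alpha x + \beta y)
= \lim_{n \to \infty} [\alpha f_n(x) + \beta f_n(y)] = \alpha f(x) + \beta f(y).$$

Moreover, choosing $n$ so large that $\|f_n - f_{n+p}\| < 1$ for all $p > 0$, we
have $\|f_{n+p}\| < \|f_n\| + 1$ for all $p > 0$, and hence

$$|f_{n+p}(x)| \le (\|f_n\| + 1)\,\|x\|.$$

It follows that

$$\lim_{p \to \infty} |f_{n+p}(x)| = |f(x)| \le (\|f_n\| + 1)\,\|x\|,$$

so that $f$ is bounded and hence continuous.

To complete the proof, we now show that the functional $f$ is the limit
of the sequence $\{f_n\}$, i.e., that

$$\lim_{n \to \infty} \|f_n - f\| = 0. \tag{8}$$

Given any $\varepsilon > 0$, let $n$ be so large that

$$\|f_n - f_{n+p}\| < \frac{\varepsilon}{3} \tag{9}$$

for all $p > 0$. By the definition of the norm in $E^*$, there is a nonzero
element $x_{n,\varepsilon} \in E$ such that

$$\|f_n - f\| \le \frac{|f_n(x_{n,\varepsilon}) - f(x_{n,\varepsilon})|}{\|x_{n,\varepsilon}\|} + \frac{\varepsilon}{3}
= |f_n(u_{n,\varepsilon}) - f(u_{n,\varepsilon})| + \frac{\varepsilon}{3},$$

where

$$u_{n,\varepsilon} = \frac{x_{n,\varepsilon}}{\|x_{n,\varepsilon}\|}.$$

Therefore

$$\|f_n - f\| \le |f_n(u_{n,\varepsilon}) - f_{n+p}(u_{n,\varepsilon})| + |f_{n+p}(u_{n,\varepsilon}) - f(u_{n,\varepsilon})| + \frac{\varepsilon}{3}$$
$$\le \|f_n - f_{n+p}\|\,\|u_{n,\varepsilon}\| + |f_{n+p}(u_{n,\varepsilon}) - f(u_{n,\varepsilon})| + \frac{\varepsilon}{3},$$

or

$$\|f_n - f\| < |f_{n+p}(u_{n,\varepsilon}) - f(u_{n,\varepsilon})| + \frac{2\varepsilon}{3} \tag{10}$$

after using (9) and the fact that $\|u_{n,\varepsilon}\| = 1$. But

$$\lim_{p \to \infty} f_{n+p}(u_{n,\varepsilon}) = f(u_{n,\varepsilon}),$$

by the very definition of $f$. Hence, taking the limit as $p \to \infty$ in (10),
we get

$$\|f_n - f\| < \varepsilon,$$

which implies (8), since $\varepsilon > 0$ is arbitrary. ∎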
Next we examine the structure of the space conjugate to a Hilbert space: 


THEOREM 2. Let $H$ be a real Hilbert space. Then, given any $x_0 \in H$,
the formula

$$f(x) = (x, x_0) \qquad (x \in H) \tag{11}$$

defines a continuous linear functional on $H$, with $\|f\| = \|x_0\|$. Conversely,
given any continuous linear functional $f$ on $H$, there is a unique element
$x_0 \in H$ such that (11) holds, with $\|x_0\| = \|f\|$.

Proof. Given any $x_0 \in H$, formula (11) obviously defines a linear
functional on $H$. By Schwarz's inequality,

$$|f(x)| = |(x, x_0)| \le \|x\|\,\|x_0\|, \tag{12}$$

so that $f$ is bounded and hence continuous. Moreover $\|f\| = \|x_0\|$,
because of (12) and the fact that $f(x_0) = \|x_0\|^2$.

Conversely, let $f$ be any continuous linear functional on $H$. If $f = 0$,
then $f$ obviously has the representation (11) with $x_0 = 0$ (in this case
$\|x_0\| = \|f\| = 0$). Otherwise, let

$$H_0 = \{x : f(x) = 0\}$$

be the null space of $f$. Since $f$ is continuous, $H_0$ is a closed subspace of $H$.
According to Theorem 3, Corollary 2, p. 126, the codimension of the null
space of any nontrivial linear functional $f$ equals 1. Therefore, by
Theorem 14, Corollary 2, p. 159, the orthogonal complement $H_0^{\perp}$ of the
space $H_0$ is one-dimensional, i.e., there exists a nonzero vector $y_0$
orthogonal to $H_0$ such that every vector $x \in H$ has a unique representation
of the form

$$x = y + \lambda y_0, \tag{13}$$

where $y \in H_0$. Clearly, there is no loss of generality in assuming that
$\|y_0\| = 1$. Now let

$$x_0 = f(y_0)\, y_0. \tag{14}$$

Then, given any $x \in H$, we have

$$f(x) = f(y + \lambda y_0) = \lambda f(y_0)$$

because of (13), and

$$(x, x_0) = \lambda (y_0, x_0) = \lambda f(y_0)(y_0, y_0) = \lambda f(y_0)$$

because of (14). Therefore (11) holds for all $x \in H$. To prove the
uniqueness of $x_0$, suppose

$$f(x) = (x, x_0') \qquad (x \in H). \tag{11'}$$

Then, subtracting (11') from (11), we get

$$(x, x_0 - x_0') = 0 \qquad (x \in H),$$

which immediately implies $x_0 = x_0'$ after choosing $x = x_0 - x_0'$. ∎
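The construction in the proof (null space, unit normal $y_0$, then $x_0 = f(y_0)\,y_0$) can be followed step by step in three-dimensional Euclidean space. The Python sketch below is illustrative only: the sample functional, the two null-space vectors chosen by hand, the use of the cross product to obtain a vector orthogonal to the null space, and all names are assumptions for the example, not part of the text.

```python
import math

def dot(u, v):
    """Scalar product in three-dimensional Euclidean space."""
    return sum(p * q for p, q in zip(u, v))

def norm(u):
    """Euclidean norm."""
    return math.sqrt(dot(u, u))

def cross(u, v):
    """Cross product; the result is orthogonal to both u and v."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def f(x):
    """An assumed sample continuous linear functional on the space."""
    return 2.0 * x[0] - x[1] + 2.0 * x[2]

# Two independent vectors of the null space H_0 = {x : f(x) = 0} (chosen by hand).
v1 = (1.0, 2.0, 0.0)
v2 = (0.0, 2.0, 1.0)

# Unit vector y_0 orthogonal to H_0, as in (13).
n = cross(v1, v2)
y0 = tuple(c / norm(n) for c in n)

# Representing element x_0 = f(y_0) * y_0, as in (14).
x0 = tuple(f(y0) * c for c in y0)

print("f(v1), f(v2) =", f(v1), f(v2))
print("x_0          =", x0)
for x in [(1.0, 0.0, 0.0), (0.5, -2.0, 3.0), (-1.0, 4.0, 2.5)]:
    print("f(x) = %.6f   (x, x_0) = %.6f" % (f(x), dot(x, x0)))
```

The element found is $x_0 = (2, -1, 2)$, whose norm 3 equals $\|f\|$, in agreement with the theorem.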


COROLLARY. The correspondence $x_0 \leftrightarrow f$ is an isomorphism between
$H$ and $H^*$, regarded as normed linear spaces.

Proof. If

$$f(x) = (x, x_0), \qquad g(x) = (x, y_0),$$

then

$$\alpha f(x) + \beta g(x) = (x, \alpha x_0 + \beta y_0).$$

Moreover $\|x_0\| = \|f\|$. ∎

19.3. The strong topology in the conjugate space. Let $E$ be a normed linear
space. Then as we have seen, the conjugate space $E^*$ is itself a normed
linear space, and a neighborhood of zero in $E^*$ means the set of all continuous
linear functionals on $E$ satisfying the condition $\|f\| < \varepsilon$ for some $\varepsilon > 0$. In
other words, for a neighborhood base at zero in the space $E^*$ we can take
the set of all functionals in $E^*$ such that $|f(x)| < \varepsilon$ when $x$ ranges over the
closed unit sphere $\|x\| \le 1$ in the space $E$. Suppose $E$ is a topological linear
space, but not a normed linear space. Then in defining the topology in $E^*$ it
seems natural to start from an arbitrary bounded set $A \subset E$, since there is no
longer a "unit sphere." This suggests


DEFINITION 2. Let $E$ be a topological linear space, with conjugate
space $E^*$. Then by the strong topology⁵ in $E^*$ is meant the topology
generated by the neighborhood base at zero consisting of all sets of the form

$$U_{A,\varepsilon} = \{f : |f(x)| < \varepsilon \text{ for all } x \in A\} \tag{15}$$

for some number $\varepsilon > 0$ and bounded set $A \subset E$.⁶


Regardless of the topology in the original set E, we have 


THEOREM 3. The conjugate space $E^*$, equipped with the strong
topology, is a locally convex $T_1$-space.

Proof. If $f_0 \in E^*$ and $f_0 \ne 0$, then there is an element $x_0 \in E$ such
that $f_0(x_0) \ne 0$. Let

$$\varepsilon = \tfrac{1}{2}\,|f_0(x_0)|, \qquad A = \{x_0\}.$$

Then clearly $f_0 \notin U_{A,\varepsilon}$, and hence $E^*$ is a $T_1$-space. To verify that the
strong topology in $E^*$ is locally convex, we need only note that $U_{A,\varepsilon}$ is
a convex set in $E^*$ for any $\varepsilon > 0$ and any bounded set $A \subset E$. ∎


Remark. The strong topology in $E^*$ will be denoted by the symbol $b$.
In cases where we want to emphasize that $E^*$ is equipped with the strong
topology, we will write $(E^*, b)$ instead of $E^*$.

19.4. The second conjugate space. Since the set of all continuous linear
functionals on a topological linear space $E$ is itself a topological linear space,
namely the conjugate space $(E^*, b)$, we can also talk about the second
conjugate space $E^{**} = (E^*)^*$, i.e., the set of all continuous linear functionals
on $E^*$, the third conjugate space $E^{***} = (E^{**})^*$, and so on.


THEOREM 4. Given a topological linear space $E$ with conjugate space
$E^*$, let $x_0$ be any fixed element of $E$. Then

$$\psi_{x_0}(f) = f(x_0)$$

is a continuous linear functional on $E^*$.

⁵ As opposed to the weak topology in $E^*$, to be discussed in Sec. 20.3.
⁶ See Problem 8.

Proof. The linearity is obvious, since

$$\psi_{x_0}(\alpha f + \beta g) = \alpha f(x_0) + \beta g(x_0) = \alpha \psi_{x_0}(f) + \beta \psi_{x_0}(g) \qquad (f, g \in E^*).$$

As for the continuity, given any $\varepsilon > 0$, let $A$ be a bounded subset of $E$
containing $x_0$ and let $U_{A,\varepsilon}$ be the neighborhood (15). Then

$$|\psi_{x_0}(f)| = |f(x_0)| < \varepsilon \quad \text{if} \quad f \in U_{A,\varepsilon},$$

i.e., the functional $\psi_{x_0}$ is continuous at 0 and hence continuous on the
whole space $E^*$. ∎


Thus the mapping

$$\pi(x) = \psi_x,$$

called the natural mapping of $E$ into $E^{**}$, is a mapping of the whole space
$E$ onto some subset $\pi(E)$ of the second conjugate space $E^{**}$. Clearly $\pi$ is
linear, in the sense that

$$\pi(\alpha x + \beta y) = \alpha \pi(x) + \beta \pi(y),$$

since $\psi_{\alpha x + \beta y}(f) = f(\alpha x + \beta y) = \alpha f(x) + \beta f(y) = \alpha \psi_x(f) + \beta \psi_y(f)$ for every $f \in E^*$.

Suppose $E$ has sufficiently many continuous linear functionals, e.g., suppose
$E$ is a normed linear space or a locally convex topological linear space
satisfying the first axiom of separation.⁷ Then $\pi$ is one-to-one, since, given
any two distinct elements $x_1, x_2 \in E$, there is a functional $f \in E^*$ such that
$f(x_1) \ne f(x_2)$ and hence $\pi(x_1) \ne \pi(x_2)$. Being the conjugate space of $(E^*, b)$,
$E^{**}$ can also be equipped with a strong topology (introduced by the obvious
analogue of Definition 2), which we denote by $b^*$.

If $\pi(E) = E^{**}$, the space $E$ is said to be semireflexive. It can be shown
(see Problem 9) that the inverse mapping $\pi^{-1}$ carrying $\pi(E)$ into $E$ is always
continuous. If $E$ is semireflexive and if $\pi$ (as well as $\pi^{-1}$) is continuous,
the space $E$ is said to be reflexive and $\pi$ then establishes a homeomorphism
between the space $E$ and $(E^{**}, b^*)$. In this case, each element $x \in E$ can be
identified with the corresponding element $\pi(x) \in E^{**}$, and hence it is convenient
to denote the value of a functional $f \in E^*$ at the point $x \in E$ by the
more symmetric notation

$$f(x) = (f, x).$$

Thus $(f, x)$ can be regarded as a functional on $E$ for each fixed $f \in E^*$, and as
a functional on $E^*$ for each fixed $x \in E$ (in the latter case, $x$ also acts like
an element of $E^{**}$).


THEOREM 5. If E is a normed linear space (so that in particular E* 
and E** are also normed linear spaces), then the natural mapping of E 
into E** is an isometry. 


⁷ Recall Problem 8, p. 183.




Proof. Given an element $x \in E$, let $\|x\|$ denote the norm of $x$ in $E$ and
$\|x\|_2$ the norm of its image in $E^{**}$. We want to show that $\|x\| = \|x\|_2$.
To this end, let $f$ be any element of $E^*$. Then

$$|(f, x)| \le \|f\|\,\|x\|,$$

i.e.,

$$\|x\| \ge \frac{|(f, x)|}{\|f\|} \qquad (f \ne 0),$$

and since the left-hand side is independent of $f$,

$$\|x\| \ge \sup_{\substack{f \in E^* \\ f \ne 0}} \frac{|(f, x)|}{\|f\|} = \|x\|_2. \tag{16}$$

On the other hand, by the Hahn-Banach theorem, for every $x_0 \in E$ there
is a linear functional $f_0$ such that

$$|(f_0, x_0)| = \|f_0\|\,\|x_0\|. \tag{17}$$

In fact, to construct such a functional, we need only set $f_0(x) = \lambda$ for any
element $x$ of the form $\lambda x_0$, and then extend $f_0$ to a functional on the whole
space $E$ (without changing its norm). It follows from (17) that

$$\|x\|_2 = \sup_{\substack{f \in E^* \\ f \ne 0}} \frac{|(f, x)|}{\|f\|} \ge \|x\|. \tag{18}$$

Comparing (16) and (18), we get

$$\|x\| = \|x\|_2. \ ∎$$


COROLLARY. The concepts of semireflexivity and reflexivity coincide 
for a normed linear space. 


Proof. If the natural mapping $\pi$ is an isometry, then obviously both
$\pi$ and $\pi^{-1}$ are continuous. ∎


Remark. According to Theorem 5, every normed linear space E is isometric
to the linear manifold $\pi(E) \subset E^{**}$.⁸ Identifying E with $\pi(E)$, we
can assert that $E \subset E^{**}$ in general, and $E = E^{**}$ if E is reflexive (or
semireflexive).


THEOREM 6. Every reflexive normed linear space is complete.

Proof. If E is reflexive, then $E = E^{**}$. But $E^{**} = (E^*)^*$ is complete,
by Theorem 1, p. 187. ∎


8 The set $\pi(E)$ need not be closed.
Example 1. Finite-dimensional Euclidean spaces and Hilbert space are
the simplest examples of reflexive spaces (in fact, for such spaces E = E*).
This follows from Theorem 2 (cf. Problem 5).


Example 2. The space $c_0$ of all sequences $x = (x_1, \ldots, x_n, \ldots)$ converging
to zero is an example of a complete nonreflexive space. In fact, as we saw
in Example 2, p. 185, the conjugate space of $c_0$ is the space $l_1$ of all absolutely
summable sequences, which in turn has the space m of all bounded sequences
(not necessarily converging to zero) as its conjugate space (see Problem 2c).


Example 3. It can be shown that the space $C_{[a,b]}$ of all continuous
functions on [a, b] is nonreflexive, and even that there is no normed linear
space with $C_{[a,b]}$ as its conjugate space.


Example 4. The space $l_p$, where $1 < p \neq 2$, is an example of a reflexive
space which does not coincide with its conjugate space. In fact, $l_p^* = l_q$,
where

$$\frac{1}{p} + \frac{1}{q} = 1,$$

and hence $l_p^{**} = l_q^* = l_p$.

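Problem 1. Let E be Euclidean n-space (real or complex), and let
$e_1, \ldots, e_n$ be a basis in E. Let $x_1, \ldots, x_n$ be the coordinates of a vector
$x \in E$ with respect to the basis $e_1, \ldots, e_n$, and let $f^1, \ldots, f^n$ be the
coordinates of a functional $f \in E^*$ with respect to the dual basis $f_1, \ldots, f_n$. Prove
that in each of the following pairs, the norm in E* is the norm “induced”
by the corresponding norm in E:

a) $\|x\| = \left(\sum_{k=1}^n |x_k|^2\right)^{1/2}$, $\quad \|f\| = \left(\sum_{k=1}^n |f^k|^2\right)^{1/2}$;

b) $\|x\| = \left(\sum_{k=1}^n |x_k|^p\right)^{1/p}$, $\quad \|f\| = \left(\sum_{k=1}^n |f^k|^q\right)^{1/q}$, where $\frac{1}{p} + \frac{1}{q} = 1$ $(p, q > 1)$;

c) $\|x\| = \sup_{1 \le k \le n} |x_k|$, $\quad \|f\| = \sum_{k=1}^n |f^k|$;

d) $\|x\| = \sum_{k=1}^n |x_k|$, $\quad \|f\| = \sup_{1 \le k \le n} |f^k|$.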
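As a numerical sanity check (not a proof) of pairs b) and c), the following sketch evaluates the pairing of f with a vector x chosen so that Hölder's inequality becomes an equality. The helper names `p_norm`, `pairing` and `extremal_vector`, as well as the sample coordinates, are illustrative assumptions and do not come from the text.

```python
def p_norm(v, p):
    """(sum |v_k|^p)^(1/p) for a finite list of coordinates v."""
    return sum(abs(t) ** p for t in v) ** (1.0 / p)

def pairing(f, x):
    """Value f(x) computed from coordinates as the sum of x_k * f^k."""
    return sum(a * b for a, b in zip(f, x))

def sign(t):
    return (t > 0) - (t < 0)

def extremal_vector(f, q):
    """x_k = sign(f^k) |f^k|^(q-1), which gives equality in Hoelder's inequality."""
    return [sign(t) * abs(t) ** (q - 1.0) for t in f]

p = 3.0
q = p / (p - 1.0)                   # conjugate exponent: 1/p + 1/q = 1
f = [0.5, -2.0, 1.5, 0.25]          # sample coordinates (hypothetical data)

x = extremal_vector(f, q)
lhs = pairing(f, x)                 # f(x)
rhs = p_norm(f, q) * p_norm(x, p)   # ||f||_q * ||x||_p

# Pair c): with the sup norm on E, x = sign(f) has norm 1 and f(x) = sum |f^k|.
x_sup = [float(sign(t)) for t in f]
value_c = pairing(f, x_sup)
```

Here `lhs` and `rhs` agree up to rounding, which is the equality case behind pair b), and `value_c` equals the sum of the absolute values of the coordinates of f, as pair c) requires.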


Problem 2. Let $l_p$ be the normed linear space of all sequences
$x = (x_1, \ldots, x_k, \ldots)$ with norm

$$\|x\| = \left(\sum_{k=1}^{\infty} |x_k|^p\right)^{1/p} < \infty \qquad (p \ge 1).$$

Prove that
a) If p > 1, the space $l_p^*$ conjugate to $l_p$ is isomorphic to the space $l_q$,
where $\frac{1}{p} + \frac{1}{q} = 1$;
b) If p > 1, the general form of a continuous linear functional on $l_p$ is

$$f(x) = \sum_{k=1}^{\infty} x_k f_k,$$

where $x = (x_1, \ldots, x_k, \ldots) \in l_p$, $f = (f_1, \ldots, f_k, \ldots) \in l_q$;
c) If p = 1, $l_1^*$ is isomorphic to the space m of all bounded sequences
$x = (x_1, \ldots, x_k, \ldots)$ with norm $\|x\| = \sup_k |x_k|$.


Problem 3. Let E be an incomplete normed linear space, with completion
$\bar{E}$. Prove that the conjugate spaces $E^*$ and $(\bar{E})^*$ are isomorphic.


Hint. Given any $f \in E^*$, extend f by continuity to a functional $\bar{f} \in (\bar{E})^*$.
Conversely, given any $\bar{f} \in (\bar{E})^*$, let f be the restriction of $\bar{f}$ to E, namely
the functional $f(x) = \bar{f}(x)$ for all $x \in E$. Show that $f \leftrightarrow \bar{f}$ is the desired
isomorphism (with $\|f\| = \|\bar{f}\|$).


Problem 4. Let E be an incomplete Euclidean space with the Hilbert
space H as its completion. Prove that E* and H are isomorphic.


Problem 5. Particularize Theorem 2 to the case of a finite-dimensional 
Euclidean space. 


Problem 6. Generalize Theorem 2 to the case of a complex Hilbert space.

Hint. Write $x_0 = \overline{f(y_0)}\, y_0$ instead of (14). The isomorphism of H and H*
associating the functional $f(x) = (x, x_0)$ with $x_0$ is then “conjugate-linear”
in the sense that $\bar{\alpha} f$ is associated with $\alpha x_0$.


Problem 7. Let $\Phi$ be the same countably normed space of “rapidly
decreasing sequences” as in Problem 12c, p. 172. Find the conjugate space $\Phi^*$.


Hint. Use Problem 6, p. 182.


Ans. $\Phi^*$ is the space of all functionals f of the form

$$f(x) = \sum_{k=1}^{\infty} x_k f_k,$$

where $f = (f_1, \ldots, f_k, \ldots)$ is any sequence satisfying the condition

$$\sum_{k=1}^{\infty} k^{-n} f_k^2 < \infty$$

for some nonnegative integer n.
Problem 8. Let E, E*, and $U_{\varepsilon, A}$ be the same as in Definition 2. Verify
that the system of sets $U_{\varepsilon, A}$ actually generates a topology b in E* such that the
linear operations in E* are continuous with respect to b. Prove that if E
is a normed linear space, then b coincides with the “norm topology” of
Sec. 19.2.


Problem 9. Let E be a topological linear space, and let $b^*$ be the strong
topology in E** and $\pi$ the natural mapping of E into E**. Prove that $\pi^{-1}$
is continuous.


Hint. The topology $b^*$ induces a topology $\pi^{-1}(b^*)$ in the space E, in
which a set $G \subset E$ is said to be open if its image $\pi(G)$ is the intersection of
$\pi(E)$ with an open subset of $(E^{**}, b^*)$. Show that $\pi^{-1}(b^*)$ is stronger than
the original topology in E.


Problem 10. Prove that every closed subspace of a reflexive space is itself 
reflexive. 


20. The Weak Topology and Weak Convergence 


20.1. The weak topology in a topological linear space. Let E be a topological
linear space, with conjugate space E*. Given any $\varepsilon > 0$ and any
finite set of continuous linear functionals $f_1, \ldots, f_r \in E^*$, the set

$$U = U_{f_1, \ldots, f_r;\, \varepsilon} = \{x : |f_1(x)| < \varepsilon, \ldots, |f_r(x)| < \varepsilon\} \tag{1}$$

is open in E and contains the point zero, i.e., U is a neighborhood of zero.
Let $\mathcal{N}_0$ be the system of all sets of the form (1). Then $\mathcal{N}_0$ is a neighborhood
base at zero, generating a topology in E which is again the topology of a
topological linear space (the details are left as an exercise). This topology is
called the weak topology in E. Every subset of E which is open in the weak
topology is also open in the original topology of E, but the converse may
not be true, i.e., $\mathcal{N}_0$ may not be a neighborhood base at zero for the original
topology in E. In other words, the weak topology is weaker (as defined on
p. 80) than the original topology, as anticipated by the terminology.
Clearly, the weak topology in E is the weakest topology $\tau$ with the property
that every linear functional continuous with respect to the original topology
is also continuous with respect to $\tau$.


20.2. Weak convergence. The weak topology in E may not satisfy the
first axiom of countability, even in the case where E is a normed linear space.
Hence the weak topology cannot in general be described in the language of
convergent sequences. Nevertheless, the weak topology determines an
important kind of convergence in E, called weak convergence. By contrast,
the convergence in E determined by the original topology (by the norm, if
E is a normed linear space) is called strong convergence.
THEOREM 1. A sequence $\{x_n\}$ of elements in a topological linear space
E is weakly convergent to an element $x_0 \in E$ if and only if the numerical
sequence $\{f(x_n)\}$ converges to $f(x_0)$ for every $f \in E^*$, i.e., for every
continuous linear functional f on E.


Proof. Clearly, there is no loss of generality in assuming that $x_0 = 0$.
Suppose $f(x_n) \to 0$ for every $f \in E^*$. Then, given any “weak neighborhood”
(1), let $N_i$ be such that $|f_i(x_n)| < \varepsilon$ for all $n > N_i$ $(i = 1, \ldots, r)$,
and let $N = \max\{N_1, \ldots, N_r\}$. Then $x_n \in U$ for all $n > N$, i.e., $\{x_n\}$
converges to $x_0$ in the weak topology.

Conversely, suppose that for each neighborhood (1), there is an integer
$N = N(U)$ such that $x_n \in U$ for all $n > N$. Then obviously $f(x_n) \to 0$
for any given $f \in E^*$, as we see by choosing f to be one of the functionals
$f_1, \ldots, f_r$ figuring in the definition of U. ∎


Specializing to the case where E is a normed linear space, we have 


THEOREM 2. Let $\{x_n\}$ be a weakly convergent sequence of elements in
a normed linear space E. Then $\{x_n\}$ is bounded, i.e., there is a constant C
such that

$$\|x_n\| \le C \qquad (n = 1, 2, \ldots).$$

Proof. Suppose $\{x_n\}$ is unbounded. Then $\{x_n\}$ is unbounded on every
closed sphere

$$S[f_0, \varepsilon] = \{f : \|f - f_0\| \le \varepsilon\}$$

in E*, in the sense that the set of numbers

$$\{(f, x_n) : f \in S[f_0, \varepsilon],\ n = 1, 2, \ldots\}$$

is unbounded for every $S[f_0, \varepsilon] \subset E^*$. In fact, if the sequence $\{x_n\}$ is
bounded on $S[f_0, \varepsilon]$, then it is also bounded on the sphere

$$S[0, \varepsilon] = \{g : \|g\| \le \varepsilon\},$$

since if $g \in S[0, \varepsilon]$, then

$$f_0 + g \in S[f_0, \varepsilon], \qquad (g, x_n) = (f_0 + g, x_n) - (f_0, x_n),$$

where the numbers $(f_0, x_n)$ are bounded, by the weak convergence of
$\{x_n\}$. But if $|(g, x_n)| \le C$ for all $g \in S[0, \varepsilon]$, then, by the isometry of the
natural mapping of E into E**,

$$\|x_n\| = \sup_{\|g\| \le 1} |(g, x_n)| = \frac{1}{\varepsilon} \sup_{\|g\| \le \varepsilon} |(g, x_n)| \le \frac{C}{\varepsilon},$$

so that $\{x_n\}$ is bounded, contrary to assumption. It follows that if $\{x_n\}$
is unbounded, then $\{x_n\}$ is unbounded on every closed sphere in E*.
Next, choosing any closed sphere $S_0 \subset E^*$, we find an integer $n_1$ and
an element $f \in S_0$ such that

$$|(f, x_{n_1})| > 1. \tag{2}$$

Since $(f, x)$ depends continuously on f, the inequality (2) holds for all f
belonging to some closed sphere $S_1 \subset S_0$. Repeating this argument, we
find an integer $n_2$ and a closed sphere $S_2 \subset S_1$ such that

$$|(f, x_{n_2})| > 2$$

for all $f \in S_2$, and so on, where in general there is an integer $n_k$ and a
closed sphere $S_k \subset S_{k-1}$ such that

$$|(f, x_{n_k})| > k$$

for all $f \in S_k$. At the same time, we can obviously see to it that the
radius of the sphere $S_k$ approaches zero as $k \to \infty$. Since E* is complete,
by Theorem 1, p. 187, it follows from the nested sphere theorem
(Theorem 2, p. 60) that there is an element $\tilde{f}$ contained in all the
spheres $S_k$. But then

$$|(\tilde{f}, x_{n_k})| > k \qquad (k = 1, 2, \ldots),$$

contrary to the assumed weak convergence of the sequence $\{x_n\}$. ∎


COROLLARY 1. Let $\{x_n\}$ be a sequence of elements in a normed linear
space E such that the numerical sequence $\{(f, x_n)\}$ is bounded for every
$f \in E^*$. Then $\{x_n\}$ is bounded.


Proof. In proving Theorem 2, the weak convergence of $\{x_n\}$ was
invoked only to infer the boundedness of the sequence $\{(f_0, x_n)\}$. ∎


Generalizing Corollary 1, we get 


COROLLARY 2. Let M be a weakly bounded subset of a normed linear
space E, i.e., a subset bounded in the weak topology. Then M is strongly
bounded, i.e., M is contained in some closed sphere.


Proof. Suppose M contains a sequence $\{x_n\}$ such that $\|x_n\| \to \infty$, and
let M′ be the set of all points $x_n$ $(n = 1, 2, \ldots)$. Since M is weakly
bounded, so is M′. This means that M′ is “absorbed” by any weak
neighborhood of zero, in particular by any neighborhood

$$U = \{x : |(f, x)| < 1\} \qquad (f \in E^*),$$

in the sense that there is a number $\alpha > 0$ such that $M' \subset \alpha U$. But then
$|(f, x_n)| < \alpha$ for all n, which, by Corollary 1, contradicts the assumption
that $\|x_n\| \to \infty$. ∎
COROLLARY 3. A necessary and sufficient condition for a subset M of
a normed linear space E to be (strongly) bounded is that every continuous
linear functional $f \in E^*$ be bounded on M.


Proof. The necessity follows at once from the inequality

$$|(f, x)| \le \|f\| \, \|x\|,$$

while the sufficiency is an immediate consequence of Corollary 2 and the
meaning of weak boundedness. ∎


A useful test for weak convergence of a sequence is given by 


THEOREM 3. A bounded sequence $\{x_n\}$ of elements in a normed linear
space E is weakly convergent to an element $x \in E$ if $f(x_n) \to f(x)$ for
every $f \in A$, where A is any set whose linear hull is everywhere dense in E*.


Proof. Let $\varphi$ be an arbitrary element of E*, and let $\{\varphi_k\}$ be a sequence
of linear combinations of elements of A converging to $\varphi$ (such a sequence
exists, since the linear hull of A is everywhere dense in E*). Let C be such that

$$\|x\| \le C, \qquad \|x_n\| \le C \quad (n = 1, 2, \ldots).$$

Moreover, given any $\varepsilon > 0$, choose k so large that $\|\varphi - \varphi_k\| < \varepsilon$ (this
is possible, since $\varphi_k \to \varphi$). Then

$$|\varphi(x_n) - \varphi(x)| \le |\varphi(x_n) - \varphi_k(x_n)| + |\varphi_k(x_n) - \varphi_k(x)| + |\varphi_k(x) - \varphi(x)| \le C\varepsilon + C\varepsilon + |\varphi_k(x_n) - \varphi_k(x)|. \tag{3}$$

But $\varphi_k(x_n) \to \varphi_k(x)$ as $n \to \infty$, since $\varphi_k$ is a linear combination of
elements of A, and $f(x_n) \to f(x)$ for every $f \in A$, by hypothesis. Therefore
we can make the right-hand side of (3) as small as we please, by
choosing $\varepsilon$ sufficiently small and n sufficiently large. It follows that
$\varphi(x_n) \to \varphi(x)$ for every $\varphi \in E^*$, i.e., $\{x_n\}$ converges weakly to x. ∎


The meaning of weak convergence in various spaces is illustrated by the
following examples:


Example 1. Given a finite-dimensional Euclidean space $R^n$, let $e_1, \ldots, e_n$
be any orthonormal basis in $R^n$, and let $\{x^{(k)}\}$ be a sequence in $R^n$ converging
weakly to a vector $x = (x_1, \ldots, x_n) \in R^n$. Then

$$(x^{(k)}, e_j) = x_j^{(k)} \to (x, e_j) = x_j \qquad (j = 1, \ldots, n),$$

i.e., for every j the sequence $\{x_j^{(k)}\}$ of components of the vectors $x^{(k)}$ converges
to the corresponding component of the limit vector x. But then

$$\|x^{(k)} - x\| = \left(\sum_{j=1}^{n} |x_j^{(k)} - x_j|^2\right)^{1/2} \to 0$$

as $k \to \infty$, so that $\{x^{(k)}\}$ converges strongly to x. On the other hand, strong
convergence obviously implies weak convergence in any space. Thus we see
that weak convergence and strong convergence are equivalent concepts in $R^n$.


Example 2. Let $\{x^{(k)}\}$ be a (strongly) bounded sequence of elements of $l_2$.
Then $\{x^{(k)}\}$ converges weakly to an element $x \in l_2$ if

$$(x^{(k)}, e_j) = x_j^{(k)} \to (x, e_j) = x_j \qquad (j = 1, 2, \ldots),$$

where

$$e_1 = (1, 0, 0, \ldots), \quad e_2 = (0, 1, 0, \ldots), \quad \ldots$$

is an orthonormal basis in $l_2$. This follows from Theorem 3, since linear
combinations of the elements $e_1, e_2, \ldots$ are everywhere dense in $l_2$, which
coincides with its own conjugate space (recall Problem 2a, p. 194). Thus
weak convergence in $l_2$ has the same interpretation in terms of components
as in $R^n$, i.e., for every j the sequence $\{x_j^{(k)}\}$ of components of the vectors
$x^{(k)}$ converges to the corresponding component of the limit vector x. However,
the concepts of weak convergence and strong convergence no longer
coincide in $l_2$. In fact, although obviously not strongly convergent, the
sequence of basis vectors $\{e_k\}$ converges weakly to zero. To see this, we note
that by Theorem 2, p. 188, every continuous linear functional f on $l_2$ can be
written as a scalar product

$$f(x) = (x, a)$$

of a variable vector $x \in l_2$ with a fixed vector $a = (a_1, \ldots, a_k, \ldots) \in l_2$, so
that in particular

$$f(e_k) = a_k.$$

But $a_k \to 0$ as $k \to \infty$ for every $a \in l_2$, and hence $f(e_k) \to 0 = f(0)$.
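The last example can be illustrated numerically. The sketch below truncates every sequence to its first N coordinates (an assumption made only so that vectors fit in memory) and uses the sample vector a with components 1/k, which is square summable; the names `e`, `inner` and `norm` are illustrative helpers.

```python
import math

N = 1000  # truncation length: only the first N coordinates are stored (assumption)

def e(k):
    """k-th basis vector e_k (1-based), truncated to N coordinates."""
    return [1.0 if i == k else 0.0 for i in range(1, N + 1)]

def inner(x, y):
    """Real scalar product of two truncated sequences."""
    return sum(s * t for s, t in zip(x, y))

def norm(x):
    return math.sqrt(inner(x, x))

a = [1.0 / i for i in range(1, N + 1)]   # a_k = 1/k, a square-summable sequence

# f(e_k) = (e_k, a) = a_k tends to zero as k grows ...
values = [inner(e(k), a) for k in (1, 10, 100, 1000)]

# ... while distinct basis vectors stay at distance sqrt(2) from each other.
gap = norm([s - t for s, t in zip(e(10), e(1000))])
```

The list `values` decreases toward zero, in line with weak convergence of the basis vectors to zero, while `gap` stays equal to the square root of 2, so the sequence cannot converge strongly.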


Example 3. Consider the space $C_{[a,b]}$ of all functions continuous on
[a, b], and let $\{x_n(t)\}$ be a sequence of functions in $C_{[a,b]}$ converging weakly
to a function $x(t) \in C_{[a,b]}$. Among the continuous linear functionals on $C_{[a,b]}$,
we have the functionals $\delta_{t_0}$, $a \le t_0 \le b$ (see Example 5, p. 179), where $\delta_{t_0}$
assigns to each function $x(t) \in C_{[a,b]}$ its value at the fixed point $t_0$. Clearly,

$$\delta_{t_0}(x_n) \to \delta_{t_0}(x)$$

means that

$$x_n(t_0) \to x(t_0).$$

Hence, if the sequence $\{x_n(t)\}$ is weakly convergent, then


1) $\{x_n(t)\}$ is uniformly bounded on [a, b], i.e., there is a constant C such
that $|x_n(t)| \le C$ for all n = 1, 2, ... and all $t \in [a, b]$;⁹

2) $\{x_n(t)\}$ is pointwise convergent on [a, b], i.e., $\{x_n(t)\}$ is a convergent
numerical sequence for every fixed $t \in [a, b]$.


9 This follows from Theorem 2.
20.3. The weak topology and weak convergence in a conjugate space. Let
E be a topological linear space, with conjugate space E*. Suppose that
in Definition 2, p. 190, we require A to be finite instead of bounded. Then
the resulting topology, generated by the neighborhood base at zero consisting
of all sets of the form

$$U_{\varepsilon, A} = \{f : |f(x)| < \varepsilon \text{ for all } x \in A\} \tag{4}$$

for some number $\varepsilon > 0$ and finite set $A \subset E$, is called the weak topology in
E* instead of the strong topology. Clearly, the set (4) can also be written as

$$U_{x_1, \ldots, x_n;\, \varepsilon} = \{f : |f(x_1)| < \varepsilon, \ldots, |f(x_n)| < \varepsilon\} \tag{4′}$$

for some $\varepsilon > 0$ and points $x_1, \ldots, x_n \in E$. Since every finite set $A \subset E$ is
bounded, while in general there are bounded infinite sets in E, the weak
topology in E* is in fact weaker than the strong topology in E* (and in
general does not coincide with the strong topology).

The weak topology in E* determines a kind of convergence in E*, called 
weak convergence (of functionals). Weak convergence of functionals plays 
an important role in many problems of functional analysis, in particular in 
the theory of generalized functions (to be discussed in the next section). 
Obviously, a sequence $\{f_n\}$ of functionals $f_n \in E^*$ is weakly convergent to a
functional $f \in E^*$ if and only if $\{f_n(x)\}$ converges to $f(x)$ for every $x \in E$.

For weakly convergent sequences of functionals, we have the following 
analogues of Theorems 2 and 3: 


THEOREM 2′. Let $\{f_n\}$ be a weakly convergent sequence of continuous
linear functionals on a Banach space E. Then $\{f_n\}$ is bounded, i.e., there is
a constant C such that

$$\|f_n\| \le C \qquad (n = 1, 2, \ldots).$$

Proof. The proof is the exact analogue of that of Theorem 2. Note
that this time we must specify that E is a complete normed linear space
(i.e., a Banach space). ∎


THEOREM 3′. A bounded sequence $\{f_n\}$ of continuous linear functionals
on a Banach space E is weakly convergent to a functional $f \in E^*$ if
$f_n(x) \to f(x)$ for every $x \in A$, where A is any set whose linear hull is everywhere
dense in E.


Proof. The exact analogue of the proof of Theorem 3. ∎


Example. Let E be the space $C_{[a,b]}$ of all functions continuous on [a, b],
and consider the functional

$$\delta_{t_0}(x) = x(t_0), \tag{5}$$

as in Example 3 above. For simplicity (and without loss of generality), we
assume that $t_0 = 0 \in (a, b)$, so that (5) becomes

$$\delta_0(x) = x(0). \tag{6}$$

Let $\{f_n(t)\}$ be a sequence of functions continuous on [a, b] such that¹⁰


1) $f_n(t)$ is positive if $|t| < 1/n$ and zero if $|t| \ge 1/n$;

2) $\int_a^b f_n(t)\,dt = 1$ for all n = 1, 2, ...,


and let

$$\delta_0^{(n)}(x) = \int_a^b f_n(t)\,x(t)\,dt.$$

Then $\delta_0^{(n)}$ is a continuous linear functional on $C_{[a,b]}$ (recall Example 4,
p. 179). Moreover, given any function $x(t) \in C_{[a,b]}$, we have

$$\delta_0^{(n)}(x) = \int_a^b f_n(t)\,x(t)\,dt = \int_{-1/n}^{1/n} f_n(t)\,x(t)\,dt = x(\xi_n) \int_{-1/n}^{1/n} f_n(t)\,dt = x(\xi_n)$$

for some $\xi_n \in [-1/n, 1/n]$, by the mean value theorem for integrals, and hence

$$\delta_0^{(n)}(x) \to x(0) = \delta_0(x) \tag{7}$$

as $n \to \infty$. Thus the sequence of functionals $\{\delta_0^{(n)}\}$ converges weakly to the
functional $\delta_0$. Suppose we write (6) in the form

$$\delta_0(x) = \int_a^b \delta(t)\,x(t)\,dt,$$

in terms of the “delta function” $\delta(t)$, as in Example 3, p. 124. Then, loosely
speaking, (7) says that “the generalized function $\delta(t)$ is the weak limit of the
sequence of ordinary functions $f_n(t)$.”
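One concrete choice of such a sequence is a family of "hat" functions, and the convergence in (7) can then be observed numerically. The sketch below assumes [a, b] = [-1, 1], takes $f_n(t) = n(1 - n|t|)$ on its support, and approximates the integral by the midpoint rule; the names `hat` and `delta_n` and the sample function `x` are illustrative choices, not taken from the text.

```python
import math

def hat(n):
    """f_n(t) = n(1 - n|t|) for |t| < 1/n and zero otherwise; its integral is 1."""
    def f(t):
        return n * (1.0 - n * abs(t)) if abs(t) < 1.0 / n else 0.0
    return f

def delta_n(n, x, steps=2000):
    """Midpoint-rule approximation of the integral of f_n(t) x(t) over [-1/n, 1/n]."""
    f = hat(n)
    lo, hi = -1.0 / n, 1.0 / n
    h = (hi - lo) / steps
    return h * sum(f(lo + (i + 0.5) * h) * x(lo + (i + 0.5) * h)
                   for i in range(steps))

def x(t):
    """Sample continuous function on [-1, 1] with x(0) = 1."""
    return math.cos(t) + t

# Distance between the n-th functional applied to x and the value x(0).
errors = [abs(delta_n(n, x) - x(0.0)) for n in (1, 10, 100)]

# Condition 2): the integral of f_n itself equals 1 (checked here for n = 5).
mass = delta_n(5, lambda t: 1.0)
```

The entries of `errors` shrink as n grows, which is what (7) predicts for this particular x. This is a check on one sample function only and does not replace the mean value argument above.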


20.4. The weak* topology. There are two ways of regarding the space E*
of continuous linear functionals on a given space E, either as the space
conjugate to the original space E, or else as an “original space” in its own
right, with conjugate space E**. Correspondingly, there are two ways of
introducing a weak topology into E*, either by using neighborhoods of the
form (4′), or else by using the values of functionals in E** on the space E*,
as in Sec. 20.1. Clearly, the two topologies will be the same if and only if
E is reflexive (why?). Suppose E is nonreflexive. Then, to avoid confusion,
the weak topology determined in E* with the aid of E** will be called simply
the weak topology, while the topology determined in E* with the aid of E

10 As an exercise, give an explicit example of such a sequence $\{f_n(t)\}$.
will be called the weak* topology.¹¹ Clearly, the weak* topology in E* is
weaker than the weak topology in E*, i.e., the weak* topology has fewer
open sets than the weak topology. Note that weak convergence as defined
in Sec. 20.3 now means weak* convergence.

The following theorem is important in various applications of the 
concept of weak convergence of functionals: 


THEOREM 4. Every bounded sequence $\{f_n\}$ of functionals in the space E*
conjugate to a separable normed linear space E contains a weakly* convergent
subsequence.


Proof. Since E is separable, there is a countable set of points
$x_1, x_2, \ldots, x_n, \ldots$ everywhere dense in E. Suppose the sequence $\{f_n\}$
of functionals in E*, i.e., continuous linear functionals on E, is bounded
(in norm). Then the numerical sequence

$$f_1(x_1), f_2(x_1), \ldots, f_n(x_1), \ldots$$

is bounded, and hence, by the Bolzano-Weierstrass theorem (see p. 101),
$\{f_n\}$ contains a subsequence

$$f_1^{(1)}, f_2^{(1)}, \ldots, f_n^{(1)}, \ldots$$

such that the numerical sequence

$$f_1^{(1)}(x_1), f_2^{(1)}(x_1), \ldots, f_n^{(1)}(x_1), \ldots$$

converges. By the same token, the subsequence $\{f_n^{(1)}\}$ in turn contains a
subsequence

$$f_1^{(2)}, f_2^{(2)}, \ldots, f_n^{(2)}, \ldots$$

such that the sequence

$$f_1^{(2)}(x_2), f_2^{(2)}(x_2), \ldots, f_n^{(2)}(x_2), \ldots$$

converges. Continuing this construction, we get a system of subsequences
$\{f_n^{(k)}\}$, k = 1, 2, ... such that


1) $\{f_n^{(k+1)}\}$ is a subsequence of $\{f_n^{(k)}\}$ for all k = 1, 2, ...;
2) $\{f_n^{(k)}\}$ converges at the points $x_1, x_2, \ldots, x_k$.


Hence, taking the “diagonal sequence”

$$f_1^{(1)}, f_2^{(2)}, \ldots, f_n^{(n)}, \ldots,$$

we get a sequence of continuous linear functionals on E such that

$$f_1^{(1)}(x_n), f_2^{(2)}(x_n), \ldots$$

11 Read “weak*” as “weak star.”
converges for all n. But then, by Theorem 3′, the sequence

$$f_1^{(1)}(x), f_2^{(2)}(x), \ldots$$

converges for all $x \in E$. ∎


COROLLARY 1. Every bounded set in the space E* conjugate to a
separable normed linear space E is relatively countably compact in the
weak* topology.


Proof. An immediate consequence of Theorem 4 and the meaning of
relative countable compactness (see Sec. 10.4). ∎


COROLLARY 2. A subset of the space E* conjugate to a separable
Banach space E is bounded if and only if it is relatively countably compact
in the weak* topology.


Proof. An immediate consequence of Theorem 2′ and Corollary 1. ∎


As we will see in a moment, the word “countably” is superfluous in
Corollaries 1 and 2. First we need


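THEOREM 5. Given a separable normed linear space E, let S be the
closed unit sphere in E and S* the closed unit sphere in the conjugate space
E*. Then the topology induced in S* by the weak* topology in E* is the
same as that induced by the metric

$$\rho(f, g) = \sum_{n=1}^{\infty} 2^{-n} |(f - g, x_n)|,$$

where $\{x_1, \ldots, x_n, \ldots\}$ is any countable set everywhere dense in S.


Proof. Clearly, $\rho(f, g)$ has all the properties of a metric, and moreover
is invariant under shifts, in the sense that

$$\rho(f + h, g + h) = \rho(f, g).$$

Hence we need only verify that

1) Every “open sphere”

$$Q_\varepsilon = \{f : \rho(f, 0) < \varepsilon\}$$

contains the intersection of S* with some weak* neighborhood of
zero in E*;

2) Every weak* neighborhood of zero in E* contains the intersection
of S* with some $Q_\varepsilon$.

Let N be such that $2^{-N} < \varepsilon/2$, and consider the weak* neighborhood of
zero

$$U = U_{x_1, \ldots, x_N;\, \varepsilon/2} = \left\{f : |(f, x_1)| < \frac{\varepsilon}{2}, \ldots, |(f, x_N)| < \frac{\varepsilon}{2}\right\}.$$

Then $f \in S^* \cap U$ implies, since $|(f, x_n)| \le \|f\| \, \|x_n\| \le 1$ for every n,

$$\rho(f, 0) = \sum_{n=1}^{N} 2^{-n} |(f, x_n)| + \sum_{n=N+1}^{\infty} 2^{-n} |(f, x_n)| < \frac{\varepsilon}{2} \sum_{n=1}^{N} 2^{-n} + \sum_{n=N+1}^{\infty} 2^{-n} < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon,$$

and hence $S^* \cap U \subset Q_\varepsilon$. This proves 1).

To prove 2), this time let

$$U = U_{y_1, \ldots, y_m;\, \delta} = \{f : |(f, y_1)| < \delta, \ldots, |(f, y_m)| < \delta\}$$

be any weak* neighborhood of zero in E*, where it can clearly be assumed
that $\|y_1\| \le 1, \ldots, \|y_m\| \le 1$. Since $\{x_1, \ldots, x_n, \ldots\}$ is everywhere
dense in S, there are indices $n_1, \ldots, n_m$ such that

$$\|y_k - x_{n_k}\| < \frac{\delta}{2} \qquad (k = 1, \ldots, m).$$

Let

$$N = \max\{n_1, \ldots, n_m\}, \qquad \varepsilon = \frac{\delta}{2^{N+1}}.$$

Then $f \in S^* \cap Q_\varepsilon$ implies

$$\sum_{n=1}^{\infty} 2^{-n} |(f, x_n)| < \varepsilon$$

and hence

$$|(f, x_n)| < 2^n \varepsilon,$$

in particular

$$|(f, x_{n_k})| < 2^{n_k} \varepsilon \le 2^N \varepsilon = \frac{\delta}{2}.$$

Therefore $f \in S^* \cap Q_\varepsilon$ implies

$$|(f, y_k)| \le |(f, x_{n_k})| + |(f, y_k - x_{n_k})| < \frac{\delta}{2} + \|f\| \, \|y_k - x_{n_k}\| < \delta,$$

so that $S^* \cap Q_\varepsilon \subset U$. ∎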


We can now drop the word “countably” in Corollaries 1 and 2:


COROLLARY 1′. Every bounded set in the space E* conjugate to a separable
normed linear space E is relatively compact in the weak* topology.

Proof. Use Theorem 5 and the fact that compactness and countable
compactness are equivalent concepts in a metric space (see Sec. 11.2). ∎


COROLLARY 2′. A subset of the space E* conjugate to a separable
Banach space E is bounded if and only if it is relatively compact in the weak*
topology.

Proof. Identical with that of Corollary 1′. ∎


Finally we prove 


THEOREM 6. Every closed sphere in the space (E*, b) conjugate to a
separable normed linear space E is compact in the weak* topology.


Proof. Every closed sphere in the space (E*, b) is closed in the weak*
topology. In fact, since a shift in E* carries every closed set (in the
weak* topology) into another closed set, we need only prove the assertion
for every sphere of the form

$$S_c = \{f : \|f\| \le c\}.$$

Suppose $f_0 \notin S_c$. Then, by the definition of the norm of the functional
$f_0$, there is an element $x \in E$ such that $\|x\| = 1$ and

$$f_0(x) = \alpha > c.$$

But then the set

$$U = \{f : f(x) > \tfrac{1}{2}(\alpha + c)\}$$

is a weak* neighborhood of $f_0$ containing no elements of $S_c$. Therefore
$S_c$ is closed in the weak* topology, and hence compact in the weak*
topology, by Corollary 1′. ∎


Remark. Theorem 6 is a special case of the following more general 
theorem, which will not be proved here: Every bounded subset of the space 
(E*, b) conjugate to a locally convex topological linear space E is relatively 
compact in the weak* topology. 


Problem 1. Given a topological linear space E, suppose E has sufficiently
many continuous linear functionals. Prove that E is a Hausdorff space, when
equipped with the weak topology.


Problem 2. Let $\{x_n\}$ be a sequence of elements in a Hilbert space H such
that

1) $\{x_n\}$ converges weakly to an element $x \in H$;
2) $\|x_n\| \to \|x\|$ as $n \to \infty$.

Prove that $\{x_n\}$ converges strongly to x, i.e., $\|x_n - x\| \to 0$ as $n \to \infty$.


Problem 3. Prove that the conclusion of the preceding problem remains
valid if the condition 2) is replaced by either of the following conditions:

2′) $\|x_n\| \le \|x\|$ for all n;
2″) $\varlimsup_{n \to \infty} \|x_n\| \le \|x\|$.


Problem 4. Let H be a (separable) Hilbert space and M a bounded subset
of H. Prove that the topology in M induced by the weak topology in H can
be specified by a metric.


Problem 5. Prove that every closed convex subset of a Hilbert space H
is closed in the weak topology (so that, in particular, every closed linear
subspace of H is weakly closed). Give an example of a closed set in H which
is not weakly closed.


Problem 6. Show that the two conditions in Example 3, p. 199 are
sufficient as well as necessary for weak convergence of a sequence $\{x_n(t)\}$ in
$C_{[a,b]}$. Give an example of a weakly convergent sequence in $C_{[a,b]}$ which is
not strongly convergent.


21. Generalized Functions 


21.1. Preliminary remarks. The degree of generality attaching to the 
notion of “function” varies from problem to problem. Some problems 
involve continuous functions, others involve functions differentiable one or 
more times, and so on. However, there are a number of situations in which 
the classical notion of a function turns out to be inadequate, even when 
understood in the most general sense (i.e., as an arbitrary rule f assigning a
number f(x) to each element x in the domain of definition of f). Here are 
two such cases: 


1) A linear mass distribution can be conveniently characterized by giving 
the density of the distribution. However, no “‘ordinary”’ function can 
specify the density corresponding to one or more points with positive 
mass. 

2) In many problems, situations arise in which various mathematical 
operations cannot be carried out. For example, a function with no 
derivative (at certain, possibly all, points) cannot be differentiated if 
the derivative is interpreted in the usual way, as an “‘ordinary”’ 
function. Of course, such difficulties can be avoided without relin- 
quishing classical definitions, by suitably restricting the class of 
“admissible functions,” for example, by considering only analytic
functions. However, restricting the class of admissible functions in
this way is often quite undesirable. Fortunately, it turns out that
difficulties of this kind can be overcome, and just as successfully at
that, by enlarging (rather than restricting) the class of admissible
functions, i.e., by introducing the notion of a “generalized function,”
not encountered in classical analysis. In doing so, a key role will be 
played by the concept of a conjugate space, considered earlier in this 
chapter. 


Remark. It cannot be emphasized too strongly that the introduction of
generalized functions is motivated by the need to solve perfectly concrete 
problems of analysis, and not merely by a desire to see how far the notion 
of function can be pushed. 


Before going into details, we indicate the basic idea behind the theory
of generalized functions. Let f be a fixed function on the real line, integrable
on every finite interval, and let $\varphi$ be any continuous function vanishing outside
some finite interval (such a function $\varphi$ is said to be finite¹²). Suppose each
$\varphi$ is assigned the number

$$(f, \varphi) = \int_{-\infty}^{\infty} f(x)\,\varphi(x)\,dx, \tag{1}$$

involving the given function f, where the integration is in effect only over a
finite interval, because of the finiteness of $\varphi$. In other words, the function
f can be regarded as a functional (a linear functional, because of the basic
properties of the integral) defined on some space K of finite functions.
However, there are many other linear functionals on K besides functionals
of the form (1). For example, by assigning each function $\varphi$ its value at the
point x = 0, we get a linear functional which cannot be represented in the
form (1). In this sense, the functions f can be regarded as part of a much
larger set, namely the set of all possible linear functionals on K. The space
K of “test functions” $\varphi$ can be chosen in various ways. For example, K
might consist of all continuous finite functions, as above. However, as will
soon be apparent, it makes sense to require the test functions to satisfy rather
stringent smoothness conditions (besides being continuous and finite).


21.2. The test space and test functions. Generalized functions. Turning
now to details, let K be the set of all finite functions $\varphi$ on $(-\infty, \infty)$ with
continuous derivatives of all orders (equivalently, the set of all infinitely
differentiable functions), where every function $\varphi \in K$, being finite, vanishes
outside some interval depending on the choice of $\varphi$. Clearly K is a linear


12 Do not confuse the notion of a finite function (which vanishes outside some finite
interval) with the notion of a bounded function (whose range is contained in some finite
interval). Finite functions are often called “functions of finite (or compact) support.”
space, when equipped with the usual operations of addition of functions and
multiplication of functions by numbers. Although the space K is not
normable, there is a natural way of introducing the notion of convergence in K:


DEFINITION 1. A sequence $\{\varphi_n\}$ of functions in K is said to converge to
a function $\varphi \in K$ if
1) There exists an interval outside which all the functions $\varphi_n$ vanish;
2) The sequence $\{\varphi_n^{(k)}\}$ of derivatives of order k converges uniformly
on this interval to $\varphi^{(k)}$ for every k = 0, 1, 2, ....¹³


The linear space K equipped with this notion of convergence is called the 
test space (or fundamental space), and the functions in K are called test 
functions (or fundamental functions). 


DEFINITION 2. Every continuous linear functional $T(\varphi)$ on the test
space K is called a generalized function on $(-\infty, \infty)$, where continuity of
$T(\varphi)$ means that $\varphi_n \to \varphi$ in K implies $T(\varphi_n) \to T(\varphi)$.


Let f(x) be a locally integrable function, i.e., a function integrable on
every finite interval. Then f(x) generates a generalized function via the
expression

$$T_f(\varphi) = (f, \varphi) = \int_{-\infty}^{\infty} f(x)\,\varphi(x)\,dx, \tag{2}$$

which is clearly a continuous linear functional on K. Generalized functions
of this type will be called regular, and all other generalized functions, i.e.,
those not representable in the form (2), will be called singular. The following
are all examples of singular generalized functions:


Example 1. The “delta function”

$$T(\varphi) = \varphi(0) \tag{3}$$

is a continuous linear functional on K, i.e., a generalized function in the
sense of Definition 2. This functional can be written in the form

$$T(\varphi) = \int_{-\infty}^{\infty} \delta(x)\,\varphi(x)\,dx, \tag{4}$$

where $\delta(x)$ is a “fictitious” function,¹⁴ equal to zero everywhere except at
x = 0 and such that

$$\int_{-\infty}^{\infty} \delta(x)\,dx = 1$$
13 As always, (9) = o,, p = 9. 


14 The term “‘delta function” will be applied to both the generalized function T(¢) and 
the fictitious function 5(x) generating T(¢) via the representation (4). 
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(these properties are of course paradoxical), since then we have, purely 
formally, 


T(9) = [° dex q(x) dx = (0) [* 80%) dx = (0). 


The advantage of regarding the delta function as a functional on the test 
space K rather than on the space C,,,, as in Example 3, p. 124 will soon 
be apparent. 


Example 2. Generalizing (3) and (4), we can write the functional

$$T(\varphi) = \varphi(a) \tag{3'}$$

in the form

$$T(\varphi) = \int_{-\infty}^{\infty} \delta(x - a)\varphi(x)\,dx, \tag{4'}$$

in terms of the “shifted delta function” $\delta(x - a)$.


21.3. Operations on generalized functions. Addition of generalized functions
and multiplication of generalized functions by numbers are defined
in the same way as for linear functionals in general, i.e., by the obvious
analogue of Definition 1, p. 183 (with $\varphi$ and K playing the roles of x and E).
In the case of regular generalized functions, these are just the operations
associated with the corresponding operations for “ordinary” functions. More
exactly, if

$$T_f(\varphi) = \int_{-\infty}^{\infty} f(x)\varphi(x)\,dx, \qquad T_g(\varphi) = \int_{-\infty}^{\infty} g(x)\varphi(x)\,dx,$$

where f and g are locally integrable and $\varphi \in K$, then clearly

$$(T_f + T_g)(\varphi) = T_f(\varphi) + T_g(\varphi) = T_{f+g}(\varphi)$$

and

$$(\alpha T_f)(\varphi) = \alpha T_f(\varphi) = T_{\alpha f}(\varphi)$$

for any number $\alpha$.


DEFINITION 3. A sequence of generalized functions $\{T_n\}$ is said to converge
to a generalized function T if $T_n(\varphi) \to T(\varphi)$ for every $\varphi \in K$. The
space of generalized functions equipped with this notion of convergence
is denoted by K*.


Remark. In other words, convergence of generalized functions is just 
weak* convergence of continuous linear functionals on K. 
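
The convergence just described can be observed numerically. The sketch below is an illustration, not part of the original text: it assumes the particular test function `bump`, a midpoint-rule helper `integrate`, and the family of ordinary functions `gaussian(n)`, all hypothetical names. It evaluates the regular functionals generated by narrowing Gaussians on one fixed test function and compares the values with the delta function T(phi) = phi(0).

```python
import math


def bump(x: float) -> float:
    """Test function exp(-1/(1 - x^2)) on (-1, 1), zero elsewhere."""
    if abs(x) >= 1.0:
        return 0.0
    return math.exp(-1.0 / (1.0 - x * x))


def integrate(f, a: float, b: float, steps: int = 20000) -> float:
    """Composite midpoint rule on [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))


def regular_functional(f, a: float = -1.0, b: float = 1.0):
    """T_f(phi) = integral of f(x) phi(x) dx; [a, b] must contain supp(phi)."""
    return lambda phi: integrate(lambda x: f(x) * phi(x), a, b)


def gaussian(n: int):
    """Ordinary function g_n(x) = n / sqrt(pi) * exp(-(n x)^2), total integral 1."""
    return lambda x: n / math.sqrt(math.pi) * math.exp(-((n * x) ** 2))


def delta(phi) -> float:
    """The delta function: T(phi) = phi(0)."""
    return phi(0.0)


if __name__ == "__main__":
    for n in (5, 20, 50):
        # The two printed values approach each other as n grows.
        print(n, regular_functional(gaussian(n))(bump), delta(bump))
```

A check on a single test function is of course no proof; Definition 3 requires convergence for every test function in K.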


We will often denote a generalized function by the symbol f, as if a
representation of the form

$$(f, \varphi) = \int_{-\infty}^{\infty} f(x)\varphi(x)\,dx \tag{5}$$

existed, even in the case where the generalized function is singular. Let f be
a regular generalized function, and let $\alpha = \alpha(x)$ be an infinitely differentiable
“ordinary” function. Then (5) implies

$$(\alpha f, \varphi) = \int_{-\infty}^{\infty} \alpha(x) f(x)\varphi(x)\,dx = \int_{-\infty}^{\infty} f(x)\alpha(x)\varphi(x)\,dx = (f, \alpha\varphi),$$

where $\alpha\varphi$ obviously belongs to K. Carrying this over to the singular case,
we get


DEFINITION 4. The product $\alpha f$ of an infinitely differentiable function $\alpha$
and a generalized function f is the functional defined by the formula

$$(\alpha f, \varphi) = (f, \alpha\varphi). \tag{6}$$


Remark. It follows from (6) that the functional $\alpha f$ is linear and continuous,
and hence itself a generalized function.
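
As a brief worked illustration (not part of the original text), the product formula can be applied to the delta function. For any infinitely differentiable $\alpha$, only the value $\alpha(0)$ matters, and in particular multiplying the delta function by x gives zero:

```latex
\begin{aligned}
(\alpha\delta, \varphi) &= (\delta, \alpha\varphi) = \alpha(0)\,\varphi(0) = (\alpha(0)\,\delta, \varphi),
  &&\text{so}\quad \alpha\delta = \alpha(0)\,\delta, \\
(x\,\delta, \varphi) &= (\delta, x\varphi) = 0 \cdot \varphi(0) = 0,
  &&\text{so}\quad x\,\delta = 0 .
\end{aligned}
```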


Again let T be a regular generalized function of the form

$$T(\varphi) = \int_{-\infty}^{\infty} f(x)\varphi(x)\,dx, \tag{5'}$$

and suppose the derivative $f'$ exists and is locally integrable. Then it is
natural to define the derivative of T as the functional

$$\frac{dT}{dx}(\varphi) = \int_{-\infty}^{\infty} f'(x)\varphi(x)\,dx. \tag{7}$$

Integrating (7) by parts and using the fact that every test function $\varphi$ vanishes
outside some finite interval, we find at once that

$$\frac{dT}{dx}(\varphi) = -\int_{-\infty}^{\infty} f(x)\varphi'(x)\,dx, \tag{8}$$

thereby obtaining an expression for dT/dx which does not involve the derivative
of f. Carrying this over to the singular case, we get


DEFINITION 5. The derivative dT/dx of a generalized function T is the
functional defined by the formula

$$\frac{dT}{dx}(\varphi) = -T(\varphi'). \tag{9}$$


Remark 1. The functional (9) is obviously linear and continuous, and
hence itself a generalized function. Second, third and higher-order derivatives
are defined in the same way.

Remark 2. If a generalized function is denoted by the symbol f, as in (6),
then its derivative is denoted by $f'$, and (9) takes the form

$$(f', \varphi) = -(f, \varphi'). \tag{9'}$$


It is an immediate consequence of Definition 5 that 

1) Every generalized function has derivatives of all orders; 

2) If a sequence of generalized functions $\{f_n\}$ converges to a generalized
function f (in the sense of Definition 3), then the sequence of derivatives
$\{f_n'\}$ converges to the derivative $f'$ of the limit function.¹⁵
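
The second assertion can be traced in one line. The following short derivation is a sketch added for illustration; it uses only Definition 3, formula (9'), and the fact that $\varphi'$ is again a test function:

```latex
(f_n', \varphi) = -(f_n, \varphi') \;\longrightarrow\; -(f, \varphi') = (f', \varphi)
\qquad \text{for every } \varphi \in K, \text{ since } \varphi' \in K .
```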


Example 1. If f is a regular generalized function whose derivative exists 
and is locally integrable (in particular, continuous or piecewise continuous), 
then the derivative of f as a generalized function coincides with its derivative 
in the ordinary sense. In fact, integrating (8) by parts, we get back (7). 


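Example 2. As in Example 1, p. 208, consider the delta function

$$T(\varphi) = \int_{-\infty}^{\infty} \delta(x)\varphi(x)\,dx.$$

It follows from Definition 5 that

$$\frac{dT}{dx}(\varphi) = -\int_{-\infty}^{\infty} \delta(x)\varphi'(x)\,dx = -\varphi'(0).$$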


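Example 3. Consider the “step function”

$$f(x) = \begin{cases} 0 & \text{if } x < 0, \\ 1 & \text{if } x > 0, \end{cases} \tag{10}$$

defining the linear functional

$$T(\varphi) = \int_{-\infty}^{\infty} f(x)\varphi(x)\,dx = \int_{0}^{\infty} \varphi(x)\,dx.$$

It follows from Definition 5 that

$$\frac{dT}{dx}(\varphi) = -\int_{0}^{\infty} \varphi'(x)\,dx = \varphi(0),$$

since $\varphi$ vanishes at infinity. Hence the derivative of (10) is just the delta
function $\delta(x)$.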
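
The computation in this example can be checked numerically. The Python sketch below is an illustration under stated assumptions, not the text's own method: the test function `phi` (a shifted bump), its hand-computed derivative `dphi`, and the helper names are all hypothetical. It applies Definition 5 to the functional generated by the step function and compares the result with phi(0).

```python
import math


def bump(x: float) -> float:
    """Test function exp(-1/(1 - x^2)) on (-1, 1), zero elsewhere."""
    if abs(x) >= 1.0:
        return 0.0
    return math.exp(-1.0 / (1.0 - x * x))


def bump_prime(x: float) -> float:
    """Derivative of bump: bump(x) * (-2x) / (1 - x^2)^2 on (-1, 1)."""
    if abs(x) >= 1.0:
        return 0.0
    return bump(x) * (-2.0 * x) / (1.0 - x * x) ** 2


def integrate(f, a: float, b: float, steps: int = 20000) -> float:
    """Composite midpoint rule on [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))


def phi(x: float) -> float:
    """Assumed test function, supported in [-0.7, 1.3]."""
    return bump(x - 0.3)


def dphi(x: float) -> float:
    """Derivative of phi."""
    return bump_prime(x - 0.3)


def step_functional(psi, upper: float = 2.0) -> float:
    """T(psi) = integral of psi over (0, infinity); `upper` must exceed supp(psi)."""
    return integrate(psi, 0.0, upper)


def generalized_derivative(T):
    """Definition 5: (dT/dx)(phi) = -T(phi'); the caller supplies phi'."""
    return lambda dpsi: -T(dpsi)


if __name__ == "__main__":
    # The two printed numbers agree up to quadrature error.
    print(generalized_derivative(step_functional)(dphi), phi(0.0))
```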


21.4. Differential equations and generalized functions. The development
of the theory of generalized functions was to a large extent motivated by
problems involving differential equations, particularly partial differential
equations. We now discuss a few simple ideas concerning generalized
functions and ordinary differential equations. The application of generalized
functions to partial differential equations is a subject lying beyond the scope
of this book.¹⁶


15 Equivalently, every convergent series of generalized functions can be differentiated
term by term any number of times.


LEMMA 1. A test function $\varphi$ can be represented as the derivative of
another test function $\varphi_1$ if and only if

$$\int_{-\infty}^{\infty} \varphi(x)\,dx = 0. \tag{11}$$


Proof. If $\varphi(x) = \varphi_1'(x)$, where $\varphi_1$ is a test function, then

$$\int_{-\infty}^{\infty} \varphi(x)\,dx = \varphi_1(x)\Big|_{-\infty}^{\infty} = 0.$$

Conversely,

$$\varphi_1(x) = \int_{-\infty}^{x} \varphi(t)\,dt$$

is an infinitely differentiable function, with derivative $\varphi(x)$, and in fact
a finite function if (11) holds, since then $\varphi$ and $\varphi_1$ vanish outside the
same interval. ∎


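LEMMA 2. Let $\varphi_1$ be a fixed test function such that

$$\int_{-\infty}^{\infty} \varphi_1(x)\,dx = 1. \tag{12}$$

Then an arbitrary test function $\varphi$ can be represented in the form

$$\varphi = \varphi_0 + c\varphi_1,$$

where c is a constant and $\varphi_0$ is a test function which is the derivative of
another test function.


Proof. Let

$$c = \int_{-\infty}^{\infty} \varphi(x)\,dx, \qquad \varphi_0(x) = \varphi(x) - \varphi_1(x)\int_{-\infty}^{\infty} \varphi(x)\,dx.$$

Then, by (12),

$$\int_{-\infty}^{\infty} \varphi_0(x)\,dx = 0,$$

and the proof follows from Lemma 1. ∎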
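
The decomposition in Lemma 2 is easy to carry out numerically. The sketch below is illustrative only: the particular test functions `phi` and `phi1`, the midpoint-rule helper `integrate`, and the constant `MASS` are assumptions introduced here, not objects from the text. It computes c, forms phi0 = phi - c * phi1, and confirms that phi0 has integral zero, which by Lemma 1 is exactly the condition for phi0 to be the derivative of another test function.

```python
import math


def bump(x: float) -> float:
    """Test function exp(-1/(1 - x^2)) on (-1, 1), zero elsewhere."""
    if abs(x) >= 1.0:
        return 0.0
    return math.exp(-1.0 / (1.0 - x * x))


def integrate(f, a: float, b: float, steps: int = 20000) -> float:
    """Composite midpoint rule on [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))


MASS = integrate(bump, -1.0, 1.0)  # integral of the unnormalized bump


def phi1(x: float) -> float:
    """Fixed test function with integral 1, playing the role of (12)."""
    return bump(x) / MASS


def phi(x: float) -> float:
    """An arbitrary test function, here supported in [-0.25, 0.75]."""
    return bump(2.0 * x - 0.5)


c = integrate(phi, -1.0, 1.0)  # the constant c of Lemma 2


def phi0(x: float) -> float:
    """phi0 = phi - c * phi1, which should have integral zero."""
    return phi(x) - c * phi1(x)


def antiderivative(f, x: float, a: float = -1.0) -> float:
    """Integral of f from a to x; for f = phi0 this vanishes again at x = 1."""
    return integrate(f, a, x)


if __name__ == "__main__":
    print("c =", c)
    print("integral of phi0 =", integrate(phi0, -1.0, 1.0))
    print("antiderivative of phi0 at 0.25 =", antiderivative(phi0, 0.25))
```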


16 See, e.g., A. Friedman, Generalized Functions and Partial Differential Equations,
Prentice-Hall, Inc., Englewood Cliffs, N.J. (1963). A key role in the development of the
theory of generalized functions was played by the pioneer work of L. Schwartz, Théorie
des Distributions, Hermann et Cie., Paris, Volume 1 (1957), Volume 2 (1959).


THEOREM 1. Every solution of the differential equation

$$y' = 0 \tag{13}$$

(in the space K* of generalized functions) is a constant.

Proof. Equation (13) means that

$$(y', \varphi) = (y, -\varphi') = 0 \tag{14}$$

for every $\varphi \in K$. This determines the value of the functional

$$(y, \varphi) = \int_{-\infty}^{\infty} y(x)\varphi(x)\,dx$$

for every function in the space $K' \subset K$ of all test functions which are
derivatives of other test functions. In fact,

$$(y, \varphi_0) = 0$$

for every $\varphi_0 \in K'$. Let $\varphi$ be an arbitrary test function. By Lemma 2,
$\varphi = \varphi_0 + c\varphi_1$, where $\varphi_0 \in K'$ and $\varphi_1$ is a fixed test function satisfying
the condition (12). We are free to give $(y, \varphi_1)$ any value at all, without
violating (14). Let

$$(y, \varphi_1) = \alpha = \text{const}.$$

Then

$$(y, \varphi) = (y, \varphi_0 + c\varphi_1) = (y, \varphi_0) + c(y, \varphi_1) = \alpha c = \int_{-\infty}^{\infty} \alpha\,\varphi(x)\,dx,$$

i.e., y is the constant $\alpha$, and moreover y satisfies the differential equation
(13). In fact, $\varphi \in K$ implies $-\varphi' \in K'$ and hence

$$(y', \varphi) = (y, -\varphi') = 0. \qquad ∎$$


COROLLARY. If two generalized functions f and g have the same derivative,
then f = g + const.


Proof. Obvious, since $(f - g)' = 0$. ∎


THEOREM 2. Given any generalized function f, there is another
generalized function y satisfying the differential equation

$$y' = f(x). \tag{15}$$


Proof. Any generalized function satisfying (15) is called an antiderivative
of f. Equation (15) means that

$$(y', \varphi) = (y, -\varphi') = (f, \varphi) = \left(f, \int_{-\infty}^{x} \varphi'(t)\,dt\right) \tag{16}$$

for every $\varphi \in K$. This determines the value of the functional $(y, \varphi)$ for
every function in the space $K' \subset K$ of all test functions which are
derivatives of other test functions. In fact,

$$(y, \varphi_0) = \left(f, -\int_{-\infty}^{x} \varphi_0(t)\,dt\right)$$

for every $\varphi_0 \in K'$. Let $\varphi$ be an arbitrary test function. By Lemma 2,
$\varphi = \varphi_0 + c\varphi_1$, where $\varphi_0 \in K'$ and $\varphi_1$ is a fixed test function satisfying
(12). We are free to give $(y, \varphi_1)$ any value at all, without violating (16).
Let

$$(y, \varphi_1) = \alpha = \text{const},$$

so that $(y, \varphi) = (y, \varphi_0) + c\alpha$ for an arbitrary test function $\varphi$.
Then y satisfies the differential equation (15). In fact, $\varphi \in K$ implies
$-\varphi' \in K'$ and hence

$$(y', \varphi) = (y, -\varphi') = \left(f, \int_{-\infty}^{x} \varphi'(t)\,dt\right) = (f, \varphi). \qquad ∎$$


COROLLARY. Any two antiderivatives of a generalized function f differ
only by a constant.


Proof. Obvious by construction or from the corollary to Theorem 1. ∎
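
As a worked illustration of these results (added here as a sketch, using only the step function (10) and Definition 5), take f to be the delta function. Writing $\theta$ for the step function (10), a notation introduced only for this example, every antiderivative of the delta function has the form $y = \theta + C$ with C a constant:

```latex
(y', \varphi) = -(\theta + C, \varphi')
  = -\int_{0}^{\infty} \varphi'(x)\,dx - C\int_{-\infty}^{\infty} \varphi'(x)\,dx
  = \varphi(0) - 0
  = (\delta, \varphi).
```

The second integral vanishes because $\varphi$ is finite, which is the same observation used in the proof of Lemma 1.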


21.5. Further developments. We now sketch some of the many extensions 
and modifications of the notion of generalized functions. 


a) Generalized functions of several variables. Let $K^n$ be the set of all
functions $\varphi(x_1, \ldots, x_n)$ of n variables with partial derivatives of all orders
with respect to all arguments, such that every $\varphi \in K^n$ vanishes outside some
parallelepiped

$$a_i \le x_i \le b_i \qquad (i = 1, \ldots, n) \tag{17}$$

in n-space. Then $K^n$ is a linear space, with addition of functions and
multiplication of functions by numbers defined in the usual way. We introduce
convergence in $K^n$ by the natural generalization of Definition 1, i.e., a
sequence $\{\varphi_k\}$ of functions in $K^n$ is said to converge to a function $\varphi \in K^n$ if


1) There exists a parallelepiped (17) outside which all the functions $\varphi_k$
vanish;
2) The sequence of partial derivatives

$$\frac{\partial^r \varphi_k}{\partial x_1^{\alpha_1} \cdots \partial x_n^{\alpha_n}} \qquad \Big(\sum_{i=1}^{n} \alpha_i = r\Big)$$

converges uniformly on this parallelepiped to the partial derivative

$$\frac{\partial^r \varphi}{\partial x_1^{\alpha_1} \cdots \partial x_n^{\alpha_n}}$$

for all $r, \alpha_1, \ldots, \alpha_n$.

Every continuous linear functional on $K^n$ is then called a generalized function
of n variables, and moreover every “ordinary” function $f(x_1, \ldots, x_n)$ of n
variables integrable on every parallelepiped can be regarded as a generalized
function, in fact the one giving rise to the functional

$$(f, \varphi) = \int f(x)\varphi(x)\,dx,$$

where

$$x = (x_1, \ldots, x_n), \qquad dx = dx_1 \cdots dx_n,$$

and the integral is over all of n-space. Convergence of generalized functions
is defined by the obvious analogue of Definition 3, while partial derivatives
of generalized functions are defined by the formula

$$\left(\frac{\partial^r f}{\partial x_1^{\alpha_1} \cdots \partial x_n^{\alpha_n}}, \varphi\right) = (-1)^r \left(f, \frac{\partial^r \varphi}{\partial x_1^{\alpha_1} \cdots \partial x_n^{\alpha_n}}\right).$$

It is clear that every generalized function of n variables has partial derivatives
of all orders.


b) Complex generalized functions. So far we have only considered real
generalized functions. Suppose the test functions are now allowed to be
complex-valued, but still finite and infinitely differentiable. Then every
continuous linear functional on the corresponding test space K is called a
complex generalized function. If $(f, \varphi)$ is such a functional, then

$$(f, \alpha\varphi) = \alpha(f, \varphi).$$

We can also consider conjugate-linear functionals on K, satisfying the
condition (cf. p. 123)

$$(f, \alpha\varphi) = \bar{\alpha}(f, \varphi),$$

where the overbar denotes the complex conjugate. If f is an “ordinary”
complex-valued function on the line, there are two natural ways of associating
linear functionals with f, i.e.,

$$(f, \varphi)_1 = \int_{-\infty}^{\infty} f(x)\varphi(x)\,dx,$$

$$(f, \varphi)_2 = \int_{-\infty}^{\infty} \overline{f(x)}\,\varphi(x)\,dx,$$

and two natural ways of associating conjugate-linear functionals with f:

$$(f, \varphi)_3 = \int_{-\infty}^{\infty} f(x)\,\overline{\varphi(x)}\,dx,$$

$$(f, \varphi)_4 = \int_{-\infty}^{\infty} \overline{f(x)}\,\overline{\varphi(x)}\,dx.$$

Each of these four choices corresponds to a possible way of embedding the
space of “ordinary” functions in the space of generalized functions. Operations
on complex generalized functions are defined by analogy with the real
case.


c) Generalized functions on the circle. Sometimes it is convenient to
consider generalized functions defined on a bounded set. As a simple example,
consider generalized functions on a circle C, choosing the test space $K_C$ to
be the set of all infinitely differentiable functions on C, equipped with the
usual operations of addition of functions and multiplication of functions by
numbers. (Note that the test functions are now automatically finite, since C
is bounded.) Then every continuous linear functional on $K_C$ is called a
generalized function on the circle. Every “ordinary” function on C can be
regarded as a periodic function on the line. In the same way, we regard
every generalized function on the circle as a periodic generalized function,
where a generalized function f is said to be periodic, with period a, if

$$(f(x), \varphi(x - a)) = (f(x), \varphi(x))$$

for every test function $\varphi \in K$.


d) Other test spaces. There are many possible choices of the test space
other than the space of infinitely differentiable finite functions. For example,
we can choose the test space to be the somewhat larger space $S_\infty$ of all
infinitely differentiable functions which, together with all their derivatives,
approach zero faster than any power of 1/|x|. More exactly, a function $\varphi$
belongs to $S_\infty$ if and only if, given any p, q = 0, 1, 2, ..., there is a constant
$C_{pq}$ (depending on p, q and $\varphi$) such that¹⁷

$$|x^p \varphi^{(q)}(x)| < C_{pq} \qquad (-\infty < x < \infty).$$

A sequence $\{\varphi_n\}$ of functions in $S_\infty$ is said to converge to a function $\varphi \in S_\infty$ if


1) The sequence $\{\varphi_n^{(q)}\}$ converges uniformly to $\varphi^{(q)}$ on every finite interval;
2) The constants $C_{pq}$ in the inequalities

$$|x^p \varphi_n^{(q)}(x)| < C_{pq}$$

can be chosen independently of n.


There are somewhat fewer continuous linear functionals on $S_\infty$ than on K.
For example, the function $f(x) = e^{x^2}$ corresponds to a continuous linear
functional $(f, \varphi)$ on K but not on $S_\infty$.
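
One way to see why a rapidly growing function fails on the larger test space is sketched below. The computation assumes the growth rate $e^{x^2}$ and uses the Gaussian $\varphi(x) = e^{-x^2}$, which is infinitely differentiable and, together with all its derivatives, decays faster than any power of 1/|x|, but is not a finite function:

```latex
(f, \varphi) = \int_{-\infty}^{\infty} e^{x^2} e^{-x^2}\,dx = \int_{-\infty}^{\infty} 1\,dx = \infty .
```

The pairing is therefore not even defined for this $\varphi$, whereas for a finite test function the integral extends only over a bounded interval and always converges.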


Remark. As the theory of generalized functions has evolved, it has
become apparent that there is no need to commit oneself once and for all
to any definite choice of test space. Rather it is best to choose a test space
which is most suitable for solving the class of problems at hand. In general,
the smaller the test space, the greater the freedom in carrying out various
analytical operations (differentiation, passage to the limit, etc.) and the larger
the number of continuous linear functionals on the space (why?). However,
we must make sure not to make the test space too small, i.e., we must require
not only that the test functions be “sufficiently smooth” but also that there be
“sufficiently many” of them (in the sense of Problem 9) to allow us to “tell
ordinary functions¹⁸ apart.”


17 As an exercise, verify that this is the same space $S_\infty$ as in Problem 12b, p. 172.


Problem 1. In the test space K of all infinitely differentiable finite
functions, let $\mathcal{B}$ be the neighborhood base at zero consisting of all sets of the
form

$$U_{\gamma_0, \ldots, \gamma_m} = \{\varphi : \varphi \in K,\ |\varphi(x)| < \gamma_0(x), \ldots, |\varphi^{(m)}(x)| < \gamma_m(x) \text{ for all } x\}$$

for some positive functions $\gamma_0, \ldots, \gamma_m$ continuous on $(-\infty, \infty)$. Prove that
the topology generated in K by $\mathcal{B}$ leads to the same kind of convergence
in K as in Definition 1.


Comment. There are other topologies in K leading to the same conver- 
gence. 


Problem 2. Let K be the test space of all infinitely differentiable finite
functions, and let $K_m$ be the subspace of K consisting of all functions $\varphi \in K$
vanishing outside the interval [-m, m]. We can make $K_m$ into a countably
normed space by setting

$$\|\varphi\|_n = \sup_{\substack{0 \le k \le n \\ |x| \le m}} |\varphi^{(k)}(x)| \qquad (n = 0, 1, 2, \ldots)$$

(cf. Problem 12a, p. 171). Verify that the topology induced in $K_m$ by the
system of norms $\|\cdot\|_n$ coincides with the topology induced in $K_m$ by the
topology of Problem 1. Verify that the convergence in $K_m$ induced
by the norms $\|\cdot\|_n$ coincides with the convergence induced in $K_m$ by the
convergence in Definition 1. Clearly $K_1 \subset K_2 \subset \cdots \subset K_m \subset \cdots$, and

$$K = \bigcup_{m=1}^{\infty} K_m.$$

Show that a set $Q \subset K$ is bounded with respect to the topology in K if and
only if there is an integer m such that Q is a bounded subset of the countably
normed space $K_m$.


Problem 3. Let K and $K_m$ be the same as in Problem 2, and let T be a
linear functional on K. Prove that the following four conditions are
equivalent:


a) T is continuous with respect to the topology of the space K;

b) T is bounded on every bounded subset $Q \subset K$;

c) If $\varphi_n \in K$ and $\varphi_n \to 0$, then $T(\varphi_n) \to 0$ (provided convergence of
sequences is defined as in Definition 1);

d) The restriction $T_m$ of the functional T to the space $K_m \subset K$ is a
continuous functional on $K_m$ for every m = 1, 2, ...


18 More exactly, regular generalized functions.


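Problem 4. Let

$$T(\varphi) = \int_{-\infty}^{\infty} \frac{1}{x}\,\varphi(x)\,dx \tag{18}$$

for every $\varphi$ in the test space K. Prove that $T(\varphi)$ is a generalized function
if the integral is understood in the sense of the Cauchy principal value.
Hint. If $\varphi$ vanishes outside the interval [a, b], write

$$\int_{-\infty}^{\infty} \frac{1}{x}\,\varphi(x)\,dx = \int_{a}^{b} \frac{\varphi(x) - \varphi(0)}{x}\,dx + \int_{a}^{b} \frac{\varphi(0)}{x}\,dx.$$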


Problem 5. Prove that the delta function and its derivative are singular 
generalized functions. Prove that the same is true of (18). 


Problem 6. Prove that addition of two generalized functions and
multiplication of a generalized function by an infinitely differentiable
function $\alpha$ (in particular, a constant) are continuous operations in the sense
that $f_n \to f$, $g_n \to g$ implies $f_n + g_n \to f + g$, $\alpha f_n \to \alpha f$. Prove that there
is no way of similarly defining a continuous product of two generalized
functions, unless the functions are regular, in which case the appropriate
definition is $T_{fg} = T_f T_g$, where

$$T_f(\varphi) = \int_{-\infty}^{\infty} f(x)\varphi(x)\,dx, \qquad T_g(\varphi) = \int_{-\infty}^{\infty} g(x)\varphi(x)\,dx,$$

$$T_{fg}(\varphi) = \int_{-\infty}^{\infty} f(x)g(x)\varphi(x)\,dx.$$


Problem 7. Let f be a piecewise continuous function on $(-\infty, \infty)$,
differentiable everywhere except at the points $x_1, x_2, \ldots, x_n, \ldots$, where it
has jumps

$$f(x_n + 0) - f(x_n - 0) = h_n \qquad (n = 1, 2, \ldots).$$

Prove that the generalized derivative of f (i.e., the derivative of f regarded as
a generalized function) is the sum of its ordinary derivative (at the points
where it exists) and the generalized function

$$g(x) = \sum_{n} h_n\,\delta(x - x_n).$$


Comment. Note that $(g, \varphi)$ reduces to a finite sum for every test function $\varphi$.

Problem 8. Find the generalized derivative of the function of period $2\pi$
equal to

$$f(x) = \begin{cases} \dfrac{\pi - x}{2} & \text{if } 0 < x \le \pi, \\ 0 & \text{if } x = 0, \\ -\dfrac{\pi + x}{2} & \text{if } -\pi \le x < 0 \end{cases} \tag{19}$$

in the interval $[-\pi, \pi]$.

Ans. $$f'(x) = -\frac{1}{2} + \pi \sum_{n=-\infty}^{\infty} \delta(x - 2n\pi).$$

Comment. The function (19) is the sum of the trigonometric series

$$\sum_{n=1}^{\infty} \frac{\sin nx}{n}. \tag{20}$$

Differentiating (20) term by term, we get the divergent series

$$\sum_{n=1}^{\infty} \cos nx.$$

Hence the concept of a generalized function allows us to ascribe a definite
meaning to a series that diverges in the ordinary sense. The same can be
done for many divergent integrals (like those encountered in quantum field
theory and other branches of theoretical physics).
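
The meaning ascribed to the divergent cosine series can be tested numerically on the circle. The following sketch is an illustration under assumptions: it uses one particular smooth periodic test function, `phi(x) = exp(cos x)`, and hypothetical helper names. It pairs the partial sums of the cosine series with `phi` over one period and compares the result with the pairing of `phi` against -1/2 plus pi times the delta function at zero, which is what the stated answer to Problem 8 gives on the interval [-pi, pi].

```python
import math


def integrate_period(f, steps: int = 4000) -> float:
    """Midpoint rule over one period [-pi, pi]."""
    h = 2.0 * math.pi / steps
    return h * sum(f(-math.pi + (i + 0.5) * h) for i in range(steps))


def phi(x: float) -> float:
    """A smooth 2*pi-periodic test function on the circle (assumed example)."""
    return math.exp(math.cos(x))


def partial_sum_pairing(big_n: int) -> float:
    """(sum_{n=1}^{N} cos nx, phi), computed term by term over one period."""
    return sum(
        integrate_period(lambda x, n=n: math.cos(n * x) * phi(x))
        for n in range(1, big_n + 1)
    )


def limit_pairing() -> float:
    """(-1/2 + pi * delta, phi) over one period."""
    return -0.5 * integrate_period(phi) + math.pi * phi(0.0)


if __name__ == "__main__":
    for big_n in (1, 3, 15):
        # The first column approaches the second as N grows.
        print(big_n, partial_sum_pairing(big_n), limit_pairing())
```

Although the partial sums do not converge at any point in the ordinary sense, their pairings with a fixed test function settle down quickly, which is exactly convergence in the sense of Definition 3.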


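Problem 9. Prove that the test space K of all infinitely differentiable finite
functions has “sufficiently many” functions in the sense that, given any two
distinct continuous functions $f_1$ and $f_2$, there exists a function $\varphi \in K$ such that

$$\int_{-\infty}^{\infty} f_1(x)\varphi(x)\,dx \neq \int_{-\infty}^{\infty} f_2(x)\varphi(x)\,dx.$$

Hint. Since $f(x) = f_1(x) - f_2(x) \not\equiv 0$, there is a point $x_0$ such that
$f(x_0) \neq 0$, and hence an interval $[\alpha, \beta]$ in which f(x) does not change sign.
Let

$$\varphi(x) = \begin{cases} e^{-1/(x - \alpha)^2}\, e^{-1/(x - \beta)^2} & \text{if } \alpha < x < \beta, \\ 0 & \text{otherwise.} \end{cases}$$

Then $\varphi \in K$ and

$$\int_{-\infty}^{\infty} f(x)\varphi(x)\,dx = \int_{\alpha}^{\beta} f(x)\varphi(x)\,dx \neq 0.$$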


Comment. This result can be extended to functions more general than
continuous functions, with the help of the concept of the Lebesgue integral
(introduced in Sec. 29).

Problem 10. Consider the homogeneous system of n linear differential
equations

$$y_i' = \sum_{k=1}^{n} a_{ik}(x)\,y_k \qquad (i = 1, \ldots, n) \tag{21}$$

in n unknowns $y_1, \ldots, y_n$, where the $a_{ik}$ are infinitely differentiable functions.
Prove that every solution of (21) in the class K* of generalized functions is a
set of “ordinary” (in fact, infinitely differentiable) functions.


Comment. This can be expressed by saying that every “generalized
solution” of (21) is also a “classical solution.”


Problem 11. Consider the nonhomogeneous system of n linear differential
equations

$$y_i' = \sum_{k=1}^{n} a_{ik}(x)\,y_k + f_i \qquad (i = 1, \ldots, n), \tag{22}$$

where the $a_{ik}$ are infinitely differentiable functions and the $f_i$ are generalized
functions. Prove that (22) has a generalized solution, which is unique to
within a solution of the homogeneous system (21). What happens if the $f_i$
are “ordinary” functions?


Problem 12. Interpret

$$f(x) = \sum_{n=1}^{\infty} \cos nx$$

as a periodic generalized function.


Hint. Recall Problem 8.

Problem 13. Show that $S_\infty$ becomes a countably normed space when
equipped with the system of norms

$$\|\varphi\|_n = \sum_{p+q=n}\ \sup_{\substack{-\infty < x < \infty \\ 0 \le i \le p \\ 0 \le j \le q}} \left|(1 + |x|^i)\,\varphi^{(j)}(x)\right|.$$

Prove that convergence of sequences in this countably normed space is
equivalent to convergence of sequences in $S_\infty$ as defined on p. 216.


6 


LINEAR OPERATORS 


22. Basic Concepts 


22.1. Definitions and examples. Given two topological linear spaces E and
$E_1$, any mapping

$$y = Ax \qquad (x \in E,\ y \in E_1)$$

of a subset of E (possibly E itself) into $E_1$ is called an operator (from E to
$E_1$). The operator A is said to be linear if

$$A(\alpha x_1 + \beta x_2) = \alpha Ax_1 + \beta Ax_2.$$

Let $D_A$ be the set of all $x \in E$ for which A is defined. Then $D_A$ is called the
domain (of definition) of the operator A. Although in general $D_A$ need not
equal E, we will always assume that $D_A$ is a linear subspace of E, i.e., that
$x, y \in D_A$ implies $\alpha x + \beta y \in D_A$ for all $\alpha$ and $\beta$.

The operator A is said to be continuous at the point $x_0 \in D_A$ if, given any
neighborhood V of the point $y_0 = Ax_0$, there is a neighborhood U of the point
$x_0$ such that $Ax \in V$ for all $x \in U \cap D_A$. We say that the operator A is
continuous if it is continuous at every point $x_0 \in D_A$.


Remark 1. Suppose E and $E_1$ are normed linear spaces. Then it is easy
to see that A is continuous if and only if, given any $\varepsilon > 0$, there is a $\delta > 0$
such that

$$\|x' - x''\| < \delta \qquad (x', x'' \in D_A)$$

implies

$$\|Ax' - Ax''\| < \varepsilon.$$

Remark 2. In the case where $E_1$ is the real line, the concept of a linear
operator reduces to that of a linear functional, and the definition of continuity
reduces to that given on p. 175. As we will see below, much of the theory
of linear functionals carries over in a straightforward way to the case of
linear operators.


Example 1. Given a topological linear space E, let Ix = x for all $x \in E$.
Then I is a continuous linear operator, called the identity (or unit) operator,
carrying each element of E into itself.


Example 2. Let E and $E_1$ be arbitrary topological linear spaces, and let
Ox = 0 for all $x \in E$, where 0 is the zero element of the space $E_1$. Then O
is a continuous linear operator, called the zero operator.


Example 3. Suppose A is a linear operator mapping the m-dimensional
space $R^m$ with basis $e_1, \ldots, e_m$ into the n-dimensional space $R^n$ with basis
$e_1', \ldots, e_n'$. If x is an arbitrary vector in $R^m$, then

$$x = \sum_{j=1}^{m} x_j e_j,$$

and hence, by the linearity of A,

$$y = Ax = \sum_{j=1}^{m} x_j Ae_j.$$

Thus the operator A is completely determined once we know the vectors in
$R^n$ into which A carries the basis vectors $e_1, \ldots, e_m$. Suppose we expand
each vector $Ae_j$ with respect to the basis $e_1', \ldots, e_n'$, obtaining

$$Ae_j = \sum_{i=1}^{n} a_{ij} e_i'.$$

Then

$$y = \sum_{i=1}^{n} y_i e_i' = \sum_{j=1}^{m} x_j Ae_j = \sum_{j=1}^{m} x_j \sum_{i=1}^{n} a_{ij} e_i',$$

and hence

$$y_i = \sum_{j=1}^{m} a_{ij} x_j,$$

i.e., the operator A is completely determined by the matrix $\|a_{ij}\|$ made up of
the coefficients $a_{ij}$.
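
The construction in this example translates directly into a few lines of code. The sketch below is illustrative: the linear map `op` from 3-space to 2-space and the helper names `matrix_of` and `apply` are hypothetical, and the standard bases are assumed. The matrix is assembled column by column from the images of the basis vectors, and applying it reproduces the action of the operator.

```python
def matrix_of(op, m: int):
    """Matrix of a linear map on R^m: column j is the image of basis vector e_j."""
    cols = [op([1.0 if k == j else 0.0 for k in range(m)]) for j in range(m)]
    n = len(cols[0])
    # Entry a[i][j] is the i-th coordinate of A e_j.
    return [[cols[j][i] for j in range(m)] for i in range(n)]


def apply(a, x):
    """Compute y_i = sum_j a_ij x_j."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in a]


def op(x):
    """A hypothetical linear operator from R^3 into R^2."""
    return [x[0] + 2.0 * x[1], 3.0 * x[2] - x[0]]


if __name__ == "__main__":
    a = matrix_of(op, 3)
    x = [1.0, 1.0, 1.0]
    print(a)                   # rows of the 2 x 3 matrix
    print(apply(a, x), op(x))  # both give the same vector
```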


Example 4. Let $H_1$ be any subspace of a Hilbert space H, and let
$H_2 = H \ominus H_1$ be the orthogonal complement of $H_1$, so that an arbitrary
element $h \in H$ has a unique representation of the form

$$h = h_1 + h_2 \qquad (h_1 \in H_1,\ h_2 \in H_2)$$

(see Theorem 14, p. 158). Let

$$Ph = h_1.$$

Then P is a continuous linear operator, called a projection operator.
Interpreted geometrically, P “projects the whole space H onto the subspace $H_1$.”


22.2. Continuity and boundedness. A linear operator mapping E into $E_1$
is said to be bounded if it maps every bounded subset of E into a bounded
subset of $E_1$. The operator analogue of Theorem 3, p. 176 for functionals is
given by


THEOREM 1. A necessary condition for a linear operator A to be continuous
on a topological linear space E is that A be bounded. The condition
is also sufficient if E satisfies the first axiom of countability.


Proof. To prove the necessity, suppose A is continuous and suppose
there is a bounded set M in E whose image $AM = \{y : y = Ax,\ x \in M\}$
is unbounded in $E_1$. Then there is a neighborhood V of zero in $E_1$ such
that none of the sets

$$\frac{1}{n}AM \qquad (n = 1, 2, \ldots)$$

is contained in V. Hence there is a sequence $\{x_n\}$ of elements of M such
that none of the elements

$$\frac{1}{n}Ax_n \qquad (n = 1, 2, \ldots)$$

belongs to V. But then the sequence

$$\left\{\frac{1}{n}x_n\right\}$$

converges to zero in E (recall Problem 6b, p. 170), while the sequence

$$\left\{\frac{1}{n}Ax_n\right\}$$

fails to converge to zero in $E_1$, contrary to the assumption that A is
continuous.
As for the sufficiency, let $\{U_n\}$ be a countable neighborhood base at
zero in E such that

$$U_1 \supset U_2 \supset \cdots \supset U_n \supset \cdots.$$

If A fails to be continuous on E, then, by the operator analogue of
Theorem 1, p. 175,¹ there is a neighborhood V of zero in $E_1$ and a sequence
$\{x_n\}$ in E such that

$$x_n \in \frac{1}{n}U_n, \qquad Ax_n \notin V \qquad (n = 1, 2, \ldots).$$

The sequence $\{nx_n\}$ is bounded in E (and even converges to zero), while
the sequence $\{nAx_n\}$ is unbounded in $E_1$, since it is contained in none
of the sets nV. But then A fails to be bounded on the bounded set
$\{x_1, 2x_2, \ldots, nx_n, \ldots\}$, contrary to hypothesis. ∎


1 As an exercise, state and prove this analogue.


Next we consider the operator analogues of Definition 2 and Theorem 4,
p. 177. Suppose E and $E_1$ are both normed linear spaces, so that in particular,
E satisfies the first axiom of countability. Then, by Theorem 1, a linear
operator A mapping E into $E_1$ is continuous if and only if it is bounded.
But by a bounded set in a normed linear space we mean a set contained in
some closed sphere $\|x\| \le C$. Therefore a linear operator A on a normed
linear space is bounded (and hence continuous) if and only if it is bounded
on every closed sphere $\|x\| \le C$, or equivalently on the closed unit sphere
$\|x\| \le 1$, because of the linearity of A. In other words, A is bounded if
and only if the number

$$\|A\| = \sup_{\|x\| \le 1} \|Ax\| \tag{1}$$

is finite.


DEFINITION. Given a bounded linear operator mapping a normed linear
space E into another normed linear space $E_1$, the number (1), equal to the
least upper bound of $\|Ax\|$ on the closed unit sphere $\|x\| \le 1$, is called the
norm of A.


THEOREM 2. The norm $\|A\|$ has the following two properties:

$$\|A\| = \sup_{x \neq 0} \frac{\|Ax\|}{\|x\|}, \tag{2}$$

$$\|Ax\| \le \|A\|\,\|x\| \quad \text{for all } x \in E. \tag{3}$$

Proof. Clearly,

$$\|A\| = \sup_{\|x\| \le 1} \|Ax\| = \sup_{\|x\| = 1} \|Ax\|$$

(why?). But the set of all vectors in E of norm 1 coincides with the set of
all vectors

$$\frac{x}{\|x\|} \qquad (x \in E,\ x \neq 0), \tag{4}$$

and hence

$$\|A\| = \sup_{\|x\| = 1} \|Ax\| = \sup_{x \neq 0} \left\|A\left(\frac{x}{\|x\|}\right)\right\| = \sup_{x \neq 0} \frac{\|Ax\|}{\|x\|},$$

which proves (2). Moreover, since the vectors (4) all have norm 1, it
follows from (1) that

$$\left\|A\left(\frac{x}{\|x\|}\right)\right\| = \frac{\|Ax\|}{\|x\|} \le \|A\| \qquad (x \in E,\ x \neq 0),$$

which implies (3) for $x \neq 0$. The validity of (3) for x = 0 is obvious. ∎


22.3. Sums and products of operators. Let A and B be two operators from
one topological linear space E to another topological linear space $E_1$. Then
by the sum of A and B, denoted by A + B, we mean the operator assigning
the element

$$y = Ax + Bx \in E_1$$

to each $x \in E$. The domain $D_C$ of the sum C = A + B is just the intersection
$D_A \cap D_B$ of the domains of A and B. It is clear that C is linear if A and B
are linear, and continuous if A and B are continuous. Let E and $E_1$ be normed
linear spaces, and suppose A and B are bounded operators. Then C = A + B
is also bounded, with norm

$$\|C\| \le \|A\| + \|B\|,$$

since, by Theorem 2 and Problem 10,

$$\|Cx\| = \|Ax + Bx\| \le \|Ax\| + \|Bx\| \le (\|A\| + \|B\|)\,\|x\|$$

for every $x \in E$.

Next, given three topological linear spaces E, $E_1$ and $E_2$, let A be an
operator from E to $E_1$ and B an operator from $E_1$ to $E_2$. Then by the product
of A and B, denoted by BA (in that order), we mean the operator assigning
the element

$$z = B(Ax) \in E_2$$

to each $x \in E$. The domain $D_C$ of the product C = BA consists of those
$x \in D_A$ for which $Ax \in D_B$. Again it is clear that C is linear if A and B are
linear, and continuous if A and B are continuous. Let E, $E_1$ and $E_2$ be normed
linear spaces, and suppose A and B are bounded operators. Then C = BA is
also bounded, with norm

$$\|C\| \le \|A\|\,\|B\|,$$

since

$$\|Cx\| = \|B(Ax)\| \le \|B\|\,\|Ax\| \le \|B\|\,\|A\|\,\|x\|.$$
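
For operators on the Euclidean plane these norm estimates can be checked directly. The sketch below is an illustration under assumptions: it takes E to be 2-space with the Euclidean norm, represents operators by 2 x 2 matrices, and uses hypothetical helper names. The function `op_norm` evaluates the norm exactly as the square root of the largest eigenvalue of the matrix product of the transpose with the matrix (a standard closed form for the Euclidean case, swapped in here for the supremum), while `op_norm_sampled` approximates the supremum of the norm of Ax over unit vectors by sampling the unit circle.

```python
import math


def matvec(a, x):
    """Apply a 2 x 2 matrix to a vector of the plane."""
    return [a[0][0] * x[0] + a[0][1] * x[1], a[1][0] * x[0] + a[1][1] * x[1]]


def matadd(a, b):
    """Sum A + B of two operators."""
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]


def matmul(b, a):
    """Product BA: first apply A, then B."""
    return [[sum(b[i][k] * a[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def vec_norm(x) -> float:
    """Euclidean norm on the plane."""
    return math.hypot(x[0], x[1])


def op_norm(a) -> float:
    """Exact operator norm: sqrt of the largest eigenvalue of A^T A."""
    p = a[0][0] ** 2 + a[1][0] ** 2
    r = a[0][1] ** 2 + a[1][1] ** 2
    q = a[0][0] * a[0][1] + a[1][0] * a[1][1]
    return math.sqrt(0.5 * (p + r + math.hypot(p - r, 2.0 * q)))


def op_norm_sampled(a, steps: int = 3600) -> float:
    """Approximate sup of ||Ax|| over unit vectors by sampling the unit circle."""
    angles = (2.0 * math.pi * k / steps for k in range(steps))
    return max(vec_norm(matvec(a, [math.cos(t), math.sin(t)])) for t in angles)


if __name__ == "__main__":
    A = [[1.0, 2.0], [0.0, 1.0]]
    B = [[2.0, 0.0], [1.0, 1.0]]
    print(op_norm(A), op_norm_sampled(A))
    print(op_norm(matadd(A, B)), op_norm(A) + op_norm(B))
    print(op_norm(matmul(B, A)), op_norm(B) * op_norm(A))
```

In each printed pair the first number does not exceed the second (up to sampling and rounding error), in agreement with the definition of the norm and with the estimates for sums and products.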


Remark 1. Sums and products of three or more operators are defined
in the natural way, e.g.,

$$CBA = C(BA) = (CB)A,$$
$$A + B + C = A + (B + C) = (A + B) + C.$$

Note that addition of operators is associative and commutative, while
multiplication of operators is associative but in general not commutative
(give an example where $AB \neq BA$).


Remark 2. By the product $\alpha A$ of the operator A and the number $\alpha$ is
meant the operator assigning the element $\alpha Ax$ to each $x \in E$. Let $\mathcal{L}(E, E_1)$
be the set of all continuous linear operators mapping E into $E_1$. Then $\mathcal{L}(E, E_1)$
is clearly a linear space when equipped with the operations of addition of
operators and multiplication of operators by numbers.

Problem 1. Prove that every linear operator on a finite-dimensional space
is automatically continuous (cf. Problem 2, p. 181).


Problem 2. Let A be a linear operator mapping m-space $R^m$ into n-space
$R^n$. Prove that the image of $R^m$, i.e., the set $\{y : y = Ax,\ x \in R^m\}$, has
dimension no greater than m.


Problem 3. Let $C_{[a,b]}$ be the linear space of functions continuous on the
interval $a \le x \le b$, equipped with the norm

$$\|f\| = \max_{a \le x \le b} |f(x)|.$$

Let K(x, y) be a fixed function of two variables, continuous on the square
$a \le x \le b$, $a \le y \le b$, and let A be the operator defined by

$$g(x) = Af(x) = \int_{a}^{b} K(x, y) f(y)\,dy.$$

Prove that A is a continuous linear operator mapping $C_{[a,b]}$ into itself.


Problem 4. Let $C^2_{[a,b]}$ be the space of functions continuous on [a, b],
equipped with the norm

$$\|f\| = \sqrt{\int_{a}^{b} f^2(x)\,dx},$$

and let A be the same as in the preceding problem. Prove that A is a
continuous linear operator mapping $C^2_{[a,b]}$ into itself.


Problem 5. Given a fixed function $\varphi(x)$ continuous on [a, b], let A be the
mapping defined by

$$g(x) = Af(x) = \varphi(x) f(x).$$

Prove that A is a continuous linear operator on both spaces $C_{[a,b]}$ and $C^2_{[a,b]}$,
mapping each space into itself.


Problem 6. Let $C^{(1)}_{[a,b]}$ be the set of all continuously differentiable functions
on [a, b], and let D be the differentiation operator, defined by

$$Df(x) = f'(x)$$

for all $f \in C^{(1)}_{[a,b]}$. Prove that


a) $C^{(1)}_{[a,b]}$ is a linear space;
b) D is a linear operator mapping $C^{(1)}_{[a,b]}$ onto $C_{[a,b]}$;
c) D is not continuous on $C_{[a,b]}$ (i.e., with respect to the norm of $C_{[a,b]}$);
d) D is continuous with respect to the norm

$$\|f\|_1 = \max_{a \le x \le b} |f(x)| + \max_{a \le x \le b} |f'(x)|.$$

Problem 7. Let $K_{[a,b]}$ be the space of infinitely differentiable functions
on [a, b], equipped with the topology generated by the countable system of
norms

$$\|f\|_n = \sup_{\substack{a \le x \le b \\ 0 \le k \le n}} |f^{(k)}(x)|$$

(cf. Problem 12a, p. 171). Prove that the differentiation operator D is a
continuous linear operator on $K_{[a,b]}$, mapping $K_{[a,b]}$ onto itself.


Problem 8. Interpret the differentiation operator as a continuous linear
operator on the space of all generalized functions.


Hint. Take continuity to mean that if a sequence of generalized functions
$\{f_n(x)\}$ converges to a generalized function f(x), then $\{f_n'(x)\}$ converges to
$f'(x)$.

Problem 9. Prove that


a) The operators in Problems 3-7 and Examples 1-4, p. 222 are all
bounded;

b) A linear operator on a countably normed space is continuous if and
only if it is bounded.

Problem 10. Let A be a bounded linear operator mapping a normed
linear space E into another normed linear space $E_1$. Suppose $\|A\|$ is defined
as the smallest number C such that $\|Ax\| \le C\|x\|$ for all $x \in E$. Prove that
$\|A\|$ is the same number as in the definition on p. 224. Particularize this to
the case of a bounded linear functional on E.


Problem 11. Let $E$ and $E_1$ be normed linear spaces, and let $\mathscr{L}(E, E_1)$ be the same as in Remark 2 above. Prove that

a) $\mathscr{L}(E, E_1)$ is a normed linear space;
b) If $E_1$ is complete, so is $\mathscr{L}(E, E_1)$;
c) If $E_1$ is complete, $A_k \in \mathscr{L}(E, E_1)$ and
$$\sum_{k=1}^\infty \|A_k\| < \infty,$$
then the series
$$\sum_{k=1}^\infty A_k$$
converges to an operator $A \in \mathscr{L}(E, E_1)$ and
$$\|A\| = \Bigl\| \sum_{k=1}^\infty A_k \Bigr\| \le \sum_{k=1}^\infty \|A_k\|.$$


23. Inverse and Adjoint Operators


23.1. The inverse operator. Invertibility. Given two topological linear spaces $E$ and $E_1$, let $A$ be an operator from $E$ to $E_1$, with domain $D_A \subset E$ and range $R_A = \{y : y = Ax,\ x \in D_A\}$. Then $A$ is said to be invertible if the equation
$$Ax = y \tag{1}$$
has a unique solution for every $y \in R_A$. If $A$ is invertible, we can associate the unique solution of (1) with each $y \in R_A$. This gives an operator, with domain $R_A$, called the inverse of $A$ and denoted by $A^{-1}$.


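THEOREM 1. The inverse $A^{-1}$ of a linear operator $A$ is itself linear.

Proof. If
$$Ax_1 = y_1, \qquad Ax_2 = y_2,$$
then
$$A^{-1}y_1 = x_1, \qquad A^{-1}y_2 = x_2,$$
and hence
$$\alpha_1 A^{-1}y_1 + \alpha_2 A^{-1}y_2 = \alpha_1 x_1 + \alpha_2 x_2. \tag{2}$$
On the other hand,
$$A(\alpha_1 x_1 + \alpha_2 x_2) = \alpha_1 y_1 + \alpha_2 y_2,$$
by the linearity of $A$, and hence
$$A^{-1}(\alpha_1 y_1 + \alpha_2 y_2) = \alpha_1 x_1 + \alpha_2 x_2. \tag{3}$$
Comparing (2) and (3), we get
$$A^{-1}(\alpha_1 y_1 + \alpha_2 y_2) = \alpha_1 A^{-1}y_1 + \alpha_2 A^{-1}y_2. \qquad ∎$$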


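LEMMA. If $M$ is an everywhere dense subset of a normed linear space $E$, then every nonzero element $y \in E$ is the sum of a series of the form
$$y = y_1 + y_2 + \cdots + y_k + \cdots,$$
where $y_k \in M$ and
$$\|y_k\| \le \frac{3\|y\|}{2^k}.$$

Proof. Since $M$ is everywhere dense in $E$, given any $y \in E$, there is an element $y_1 \in M$ such that
$$\|y - y_1\| \le \frac{\|y\|}{2}.$$
By the same token, there are elements $y_2, y_3, \ldots, y_k, \ldots$ in $M$ such that
$$\|y - y_1 - y_2\| \le \frac{\|y\|}{4},$$
$$\|y - y_1 - y_2 - y_3\| \le \frac{\|y\|}{8},$$
$$\cdots\cdots\cdots$$
$$\|y - y_1 - \cdots - y_n\| \le \frac{\|y\|}{2^n},$$
$$\cdots\cdots\cdots$$
Then
$$\Bigl\| y - \sum_{k=1}^n y_k \Bigr\| \to 0$$
as $n \to \infty$, by the construction of the sequence $\{y_k\}$, i.e., the series
$$\sum_{k=1}^\infty y_k$$
converges to $y$. Moreover
$$\|y_1\| = \|y_1 - y + y\| \le \|y_1 - y\| + \|y\| \le \frac{3\|y\|}{2},$$
$$\|y_2\| = \|y_2 + y_1 - y + y - y_1\| \le \|y - y_1 - y_2\| + \|y - y_1\| \le \frac{3\|y\|}{4},$$
and in general,
$$\|y_k\| = \|y_k + y_{k-1} + \cdots + y_1 - y + y - y_1 - \cdots - y_{k-1}\|$$
$$\le \|y - y_1 - \cdots - y_k\| + \|y - y_1 - \cdots - y_{k-1}\| \le \frac{\|y\|}{2^k} + \frac{\|y\|}{2^{k-1}} = \frac{3\|y\|}{2^k}. \qquad ∎$$
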
THEOREM 2 (Banach). Let $A$ be an invertible bounded linear operator mapping a Banach space $E$ onto another Banach space $E_1$. Then the inverse operator $A^{-1}$ is itself bounded.


Proof. Let $M_k$ be the subset of $E_1$ consisting of all $y \in E_1$ such that
$$\|A^{-1}y\| \le k\|y\|.$$
Every element in $E_1$ belongs to some $M_k$, i.e.,
$$E_1 = \bigcup_{k=1}^\infty M_k.$$
By Baire's theorem (Theorem 3, p. 61), at least one of the sets $M_k$, say $M_n$, is dense in some (open) sphere $S \subset E_1$. Choosing a point $y_0 \in S \cap M_n$, we can find numbers $\alpha$ and $\beta$ ($\alpha < \beta$) such that $S$ contains the spherical layer
$$P = \{z : \alpha < \|z - y_0\| < \beta,\ z \in E_1\}.$$
Shifting $P$ so that its center coincides with the origin, we get another spherical layer $P_0$. Some set $M_N$ is dense in $P_0$. In fact, if $z \in P \cap M_n$, then $z - y_0 \in P_0$ and
$$\|A^{-1}(z - y_0)\| \le \|A^{-1}z\| + \|A^{-1}y_0\| \le n(\|z\| + \|y_0\|) \le n(\|z - y_0\| + 2\|y_0\|)$$
$$= n\|z - y_0\| \Bigl(1 + \frac{2\|y_0\|}{\|z - y_0\|}\Bigr) \le n\|z - y_0\| \Bigl(1 + \frac{2\|y_0\|}{\alpha}\Bigr), \tag{4}$$
where the quantity
$$\gamma = n\Bigl(1 + \frac{2\|y_0\|}{\alpha}\Bigr)$$
is independent of $z$. Let
$$N = 1 + [\gamma]$$
(recall footnote 4, p. 8). Then, by (4), $z - y_0 \in M_N$. Hence $M_N$ is dense in $P_0$, since $M_n$ is dense in $P$.

Now, given any nonzero element $y \in E_1$, we can always find a number $\lambda \ne 0$ such that $\alpha < \|\lambda y\| < \beta$, i.e., such that $\lambda y \in P_0$. Since $M_N$ is dense in $P_0$, there is a sequence $\{y_k\}$, $y_k \in M_N$, converging to $\lambda y$. Then $\{y_k/\lambda\}$ converges to $y$. Clearly, if $y_k \in M_N$, then $y_k/\lambda \in M_N$ for any $\lambda \ne 0$. Therefore $M_N$ is dense in $E_1 - \{0\}$ and hence in $E_1$ itself. It follows from the lemma that $y$ is the sum of a series of the form
$$y = y_1 + y_2 + \cdots + y_k + \cdots,$$
where $y_k \in M_N$ and
$$\|y_k\| \le \frac{3\|y\|}{2^k}.$$
Consider the series
$$\sum_{k=1}^\infty x_k \tag{5}$$
with terms $x_k = A^{-1}y_k \in E$ equal to the preimages of the elements $y_k \in E_1$. Since
$$\|x_k\| = \|A^{-1}y_k\| \le N\|y_k\| \le \frac{3N\|y\|}{2^k},$$
the series (5) converges to an element $x \in E$, where
$$\|x\| \le \sum_{k=1}^\infty \|x_k\| \le 3N\|y\| \sum_{k=1}^\infty \frac{1}{2^k} = 3N\|y\|.$$
Since (5) is convergent and the operator $A$ is continuous on $E$ (being bounded), we can apply $A$ term by term to (5), obtaining
$$Ax = Ax_1 + Ax_2 + \cdots + Ax_k + \cdots = y_1 + y_2 + \cdots + y_k + \cdots = y,$$
which implies
$$x = A^{-1}y.$$
Moreover,
$$\|A^{-1}y\| = \|x\| \le 3N\|y\|$$
for all $y \ne 0$, and hence $A^{-1}$ is bounded. ∎

THEOREM 3. Let $A_0$ be an invertible bounded linear operator mapping a Banach space $E$ into another Banach space $E_1$, and let $\Delta A$ be a bounded linear operator mapping $E$ into $E_1$ such that
$$\|\Delta A\| < \frac{1}{\|A_0^{-1}\|}. \tag{6}$$
Then the operator
$$A = A_0 + \Delta A$$
maps $E$ onto $E_1$ and has a bounded inverse.


Proof. Let $y$ be a fixed element of $E_1$, and consider the mapping $B$ of the space $E$ into itself defined by
$$Bx = A_0^{-1}y - A_0^{-1}\Delta A x.$$
It follows from (6) that $B$ is a contraction mapping. Hence, by Theorem 1, p. 66, $B$ has a unique fixed point $x$ such that
$$x = Bx = A_0^{-1}y - A_0^{-1}\Delta A x. \tag{7}$$
But (7) implies
$$Ax = A_0 x + \Delta A x = y.$$
Clearly, if $Ax' = y$, then $x'$ is also a fixed point of $B$, and hence $x' = x$. Therefore, given any $y \in E_1$, the equation $Ax = y$ has a unique solution in $E$, i.e., the operator $A$ is invertible with inverse $A^{-1}$. Moreover, $A^{-1}$ is bounded, by Theorem 2. ∎
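
The claim that $B$ is a contraction is stated without computation. A short sketch of the missing estimate follows; the symbol $\theta$ is notation introduced here for illustration and does not appear in the text.

```latex
\|Bx - Bx'\| = \|A_0^{-1}\,\Delta A\,(x - x')\|
  \le \|A_0^{-1}\|\,\|\Delta A\|\,\|x - x'\|
  = \theta\,\|x - x'\|,
\qquad
\theta = \|A_0^{-1}\|\,\|\Delta A\| < 1 \quad \text{by (6)}.
```

Since $\theta < 1$, the contraction mapping principle applies in the complete space $E$.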


THEOREM 4. Let $E$ be a Banach space, and let $I$ be the identity operator on $E$. Suppose $A$ is a bounded linear operator mapping $E$ into itself, such that
$$\|A\| < 1. \tag{8}$$
Then the operator $(I - A)^{-1}$ exists, is bounded and can be represented in the form
$$(I - A)^{-1} = \sum_{k=0}^\infty A^k. \tag{9}$$


Proof. The existence and boundedness of $(I - A)^{-1}$ follows from Theorem 3 (and will also emerge in the course of the proof). It follows from (8) that
$$\sum_{k=0}^\infty \|A^k\| \le \sum_{k=0}^\infty \|A\|^k < \infty.$$
But then, by the completeness of $E$, the sum of the series
$$\sum_{k=0}^\infty A^k$$
is a bounded linear operator (see Problem 11c, p. 227). Given any $n$, we have
$$(I - A)\sum_{k=0}^n A^k = \sum_{k=0}^n A^k (I - A) = I - A^{n+1}.$$
Hence, taking the limit as $n \to \infty$ and bearing in mind that
$$\|A^{n+1}\| \le \|A\|^{n+1} \to 0,$$
we get
$$(I - A)\sum_{k=0}^\infty A^k = \sum_{k=0}^\infty A^k (I - A) = I,$$
which implies (9). ∎
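
The series in (9) can be checked numerically in the finite-dimensional case. The following is a minimal sketch, assuming a $2 \times 2$ real matrix whose maximum absolute row sum is less than 1 (so that its operator norm with respect to the maximum norm is less than 1); the helper names and the sample matrix are chosen here for illustration only.

```python
def mat_mul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_add(X, Y):
    """Entrywise sum of two matrices of the same shape."""
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]


def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def neumann_partial_sum(A, terms):
    """Return the partial sum I + A + A^2 + ... + A^(terms - 1)."""
    n = len(A)
    total = [[0.0] * n for _ in range(n)]
    power = identity(n)  # current power A^k, starting with A^0 = I
    for _ in range(terms):
        total = mat_add(total, power)
        power = mat_mul(power, A)
    return total


# Sample matrix (an assumption): row sums 0.3 and 0.7, both below 1.
A = [[0.2, 0.1],
     [0.3, 0.4]]
S = neumann_partial_sum(A, 200)

# Residual (I - A) S - I, which equals -A^200 up to rounding.
I_minus_A = [[0.8, -0.1],
             [-0.3, 0.6]]
residual = mat_add(mat_mul(I_minus_A, S), [[-1.0, 0.0], [0.0, -1.0]])
max_residual = max(abs(v) for row in residual for v in row)
print(max_residual)
```

The residual shrinks like $\|A\|^{n+1}$, which mirrors the estimate used in the proof.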


23.2. The adjoint operator. Given two topological linear spaces $E$ and $E_1$, let $A$ be a continuous linear operator mapping $E$ into $E_1$, and let $g$ be a continuous linear functional on $E_1$, i.e., an element of the conjugate space $E_1^*$. Suppose we apply $g$ to the element $y = Ax$, thereby obtaining a new functional
$$f(x) = g(Ax) \qquad (x \in E). \tag{10}$$
Clearly, $f$ is continuous and linear (why?), and hence an element of the conjugate space $E^*$. Thus (10) associates a functional $f \in E^*$ with each functional $g \in E_1^*$, i.e., (10) defines an operator mapping $E_1^*$ into $E^*$. This operator is called the adjoint of $A$, and is denoted by $A^*$. Using the symmetric notation $(f, x)$ for the functional $f(x)$, we can write (10) in the form
$$(g, Ax) = (f, x)$$
or
$$(g, Ax) = (A^*g, x). \tag{11}$$
Equation (11) can be regarded as a concise definition of the adjoint of $A$.

Example. As in Example 3, p. 222, suppose $A$ is a linear operator with matrix $\|a_{ij}\|$ mapping $m$-space $R^m$ into $n$-space $R^n$. Then the mapping $y = Ax$ can be written as a system of equations
$$y_i = \sum_{j=1}^m a_{ij} x_j \qquad (i = 1, \ldots, n), \tag{12}$$
while the functional $f(x)$ can be written in the form
$$f(x) = \sum_{j=1}^m f_j x_j,$$
where $f_j = f(e_j)$ in terms of a basis $e_1, \ldots, e_m$ in $R^m$. Since
$$f(x) = g(Ax) = \sum_{i=1}^n g_i y_i = \sum_{i=1}^n \sum_{j=1}^m g_i a_{ij} x_j = \sum_{j=1}^m x_j \sum_{i=1}^n g_i a_{ij},$$
we find that
$$f_j = \sum_{i=1}^n a_{ij} g_i$$
or
$$f_i = \sum_{j=1}^n a_{ji} g_j \tag{13}$$
after interchanging the roles of the indices $i$ and $j$. But $f = A^*g$, and hence comparing (12) and (13), we see that the matrix of the operator $A^*$ is $\|a_{ji}\|$, i.e., the transpose of the matrix of $A$.
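
The identity $(g, Ax) = (A^*g, x)$ with $A^*$ given by the transposed matrix can be verified on a small case. This is only an illustrative sketch: the matrix, the vector and the functional below are arbitrary sample data, and the function names are hypothetical.

```python
def apply_matrix(a, x):
    """Compute y_i = sum_j a[i][j] * x[j], i.e., the mapping y = Ax of (12)."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in a]


def transpose(a):
    """Return the transposed matrix as a list of rows."""
    return [list(col) for col in zip(*a)]


def pairing(f, x):
    """Value (f, x) = sum_k f_k x_k of the functional f on the vector x."""
    return sum(f_k * x_k for f_k, x_k in zip(f, x))


a = [[1.0, 2.0, 0.0],
     [-1.0, 3.0, 4.0]]   # n = 2 rows, m = 3 columns: A maps R^3 into R^2
x = [2.0, -1.0, 0.5]     # a vector in R^3
g = [3.0, 5.0]           # a functional on R^2, given by its coefficients

lhs = pairing(g, apply_matrix(a, x))             # (g, Ax)
rhs = pairing(apply_matrix(transpose(a), g), x)  # (A*g, x), A* = transpose
print(lhs, rhs)
```

Both numbers agree, as (11) requires.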


It follows at once from the definition of the adjoint of an operator that 


1) A* is linear; 
2) (A + B)* = A* + B*; 
3) $(\alpha A)^* = \alpha A^*$ for arbitrary complex $\alpha$.


A somewhat less obvious property of the adjoint operator is given by 


THEOREM 5. Let $A$ be a bounded linear operator mapping a Banach space $E$ into another Banach space $E_1$, and let $A^*$ be the adjoint of $A$. Then $A^*$ is bounded and
$$\|A^*\| = \|A\|. \tag{14}$$


Proof. By the properties of the norm of an operator, we have
$$|(A^*g, x)| = |(g, Ax)| \le \|g\|\,\|A\|\,\|x\|,$$
which implies
$$\|A^*g\| \le \|A\|\,\|g\|,$$
and hence
$$\|A^*\| \le \|A\|. \tag{15}$$
Suppose $x \in E$, $Ax \ne 0$, and let
$$y_0 = \frac{Ax}{\|Ax\|} \in E_1,$$
so that, in particular, $\|y_0\| = 1$. Let $g$ be the functional such that
$$g(\lambda y_0) = \lambda$$
on the set $L \subset E_1$ of all elements of the form $\lambda y_0$. Then clearly $(g, y_0) = 1$, $\|g\|_{\text{on } L} = 1$. Using the Hahn-Banach theorem, we can extend $g$ to a functional on the whole space $E_1$ such that $\|g\| = 1$ and
$$(g, y_0) = 1, \quad \text{i.e.,} \quad (g, Ax) = \|Ax\|.$$
Therefore
$$\|Ax\| = (g, Ax) = |(A^*g, x)| \le \|A^*g\|\,\|x\| \le \|A^*\|\,\|g\|\,\|x\| = \|A^*\|\,\|x\|,$$
which implies
$$\|A\| \le \|A^*\|. \tag{16}$$
Comparing (15) and (16), we get (14). ∎


23.3. The adjoint operator in Hilbert space. Self-adjoint operators. Next we consider the case where $A$ is a bounded linear operator mapping a (real or complex) Hilbert space $H$ into itself. According to the corollary to Theorem 2, p. 188, the mapping $\tau$ assigning the linear functional
$$(\tau y)(x) = (x, y)$$
to every $y \in H$ establishes an isomorphism between $H$ and the conjugate space $H^*$.² Let $A^*$ be the adjoint of the operator $A$. Then clearly the mapping $\tilde{A}^* = \tau^{-1}A^*\tau$ is a bounded linear operator mapping $H$ into itself, such that
$$(Ax, y) = (x, \tilde{A}^*y) \tag{17}$$
for all $x, y \in H$. Moreover $\|\tilde{A}^*\| = \|A\|$, since $\|A^*\| = \|A\|$ and the mappings $\tau$ and $\tau^{-1}$ are isometric.

We now establish the following convention: If $H$ is a Hilbert space, then by the adjoint of an operator $A$ mapping $H$ into $H$, we mean the operator $\tilde{A}^*$ defined by (17). Note that $\tilde{A}^*$, like $A$, maps $H$ into $H$. To keep the notation simple, we will henceforth drop the tilde, writing $A^*$ instead of $\tilde{A}^*$. Replacing $\tilde{A}^*$ by $A^*$ in (17), we get
$$(Ax, y) = (x, A^*y) \tag{17'}$$
for all $x, y \in H$.

² Or a "conjugate-linear isomorphism" in the case where $H$ is complex (see Problem 6, p. 194).

Remark. It should be emphasized that this definition of $A^*$ differs from the definition of the adjoint of an operator $A$ mapping an arbitrary Banach space $E$ into itself, in which case $A^*$ is defined on the conjugate space $E^*$ rather than on the space $E$ itself. The context will always make it clear whether $A^*$ is the operator defined by (11) or the operator defined by (17').


Let A be a bounded linear operator mapping a Hilbert space H into itself. 
Then it makes sense to ask whether or not A = A*, since A and A* are 
defined on the same space. This leads to the following 


DEFINITION. A bounded linear operator $A$ mapping a Hilbert space $H$ into itself is said to be self-adjoint if $A = A^*$, i.e., if
$$(Ax, y) = (x, Ay)$$
for all $x, y \in H$.


Remark. Everything said above continues to hold if we replace $H$ by the real $n$-space $R^n$ or complex $n$-space $C^n$.


23.4. The spectrum of an operator. The resolvent. In the theory of linear operators and their applications, a central role is played by the notion of the "spectrum" of an operator.³ Let $A$ be a linear operator mapping a topological linear space $E$ into itself. Then a number $\lambda$ is called an eigenvalue of $A$ if the equation
$$Ax = \lambda x$$
has at least one nonzero solution, and every such solution $x$ is called an eigenvector of $A$ (corresponding to the eigenvalue $\lambda$). Suppose $E$ is finite-dimensional. Then the set of all eigenvalues of $A$ is called the spectrum of $A$, and all other values of $\lambda$ are said to be regular (points). In other words, $\lambda$ is regular if and only if the operator $A - \lambda I$ is invertible. The operator $(A - \lambda I)^{-1}$ is then automatically bounded, like every operator on a finite-dimensional space (cf. Problem 1, p. 226). Thus there are just two possibilities in the finite-dimensional case:

1) The equation $Ax = \lambda x$ has a nonzero solution, i.e., $\lambda$ is an eigenvalue of $A$, so that the operator $(A - \lambda I)^{-1}$ fails to exist;
2) The operator $(A - \lambda I)^{-1}$ exists and is bounded, i.e., $\lambda$ is a regular point.

However, in the case where $E$ is infinite-dimensional, there is a third possibility:

3) The operator $(A - \lambda I)^{-1}$ exists (i.e., the equation $Ax = \lambda x$ has no nonzero solutions), but is not bounded.

³ In talking about the spectrum of an operator, it will always be tacitly assumed that the operator is defined on a complex space.

To describe this more general situation, we introduce some new terminology and make an important modification in the definition of the spectrum. Given an operator $A$ mapping a (complex) topological linear space $E$ into itself, the operator
$$R_\lambda = (A - \lambda I)^{-1} \tag{18}$$
is called the resolvent of $A$. The values of $\lambda$ for which $R_\lambda$ is defined for all $E$ and continuous are said to be regular (points) of $A$, and the set of all other values of $\lambda$ is called the spectrum of $A$. The eigenvalues of $A$ still belong to the spectrum, since if $(A - \lambda I)x = 0$ for some $x \ne 0$, then (18) fails to exist. The set of all these eigenvalues is now called the point spectrum, and the rest of the spectrum is called the continuous spectrum. In other words, the continuous spectrum consists of all $\lambda$ for which (18) exists but fails to be continuous. Thus there are now exactly three possibilities for any given value of $\lambda$:

1) $\lambda$ is a regular point;
2) $\lambda$ is an eigenvalue;
3) $\lambda$ is a point of the continuous spectrum.

The possibility of an operator having a continuous spectrum is a characteristic feature of the theory of operators in infinite-dimensional spaces, distinguishing it from the finite-dimensional case.


THEOREM 6. Let $A$ be a linear operator mapping a Banach space $E$ into itself. Then the set $\Lambda$ of all regular points of $A$ is open (equivalently, the complement of $\Lambda$ is closed).

Proof. If $\lambda$ is regular, the operator $(A - \lambda I)^{-1}$ exists and is bounded. Hence, for sufficiently small $\delta$, the operator $(A - (\lambda + \delta)I)^{-1}$ also exists and is bounded, by Theorem 3. In other words, the point $\lambda + \delta$ is regular for sufficiently small $\delta$. ∎


THEOREM 7. If $A$ is a bounded linear operator mapping a Banach space $E$ into itself and if $|\lambda| > \|A\|$, then $\lambda$ is a regular point. In other words, the spectrum of $A$ is contained in the disk of radius $\|A\|$ with center at the origin.

Proof. Obviously
$$A - \lambda I = -\lambda\Bigl(I - \frac{A}{\lambda}\Bigr)$$
and
$$R_\lambda = (A - \lambda I)^{-1} = -\frac{1}{\lambda}\Bigl(I - \frac{A}{\lambda}\Bigr)^{-1} = -\frac{1}{\lambda}\sum_{k=0}^\infty \frac{A^k}{\lambda^k}.$$
If $\|A\| < |\lambda|$, then $\|A/\lambda\| < 1$, and hence $R_\lambda$ exists and is bounded, by Theorem 4. ∎

Example 1. In the space $C = C_{[0,1]}$, consider the operator $A$ defined by
$$Ax(t) = \mu(t)x(t),$$
where $\mu(t)$ is a fixed function continuous on $[0, 1]$. Then
$$(A - \lambda I)x(t) = (\mu(t) - \lambda)x(t),$$
and
$$(A - \lambda I)^{-1}x(t) = \frac{1}{\mu(t) - \lambda}\,x(t).$$
Hence the spectrum of $A$ consists of all $\lambda$ such that $\mu(t) - \lambda$ vanishes for some $t$ in the interval $[0, 1]$, i.e., the spectrum is the range of the function $\mu(t)$.

Example 2. Suppose $\mu(t) = t$ in the preceding example. Then the spectrum is just the interval $[0, 1]$. On the other hand, there are obviously no eigenvalues. Thus the operator $A$ defined by
$$Ax(t) = tx(t)$$
is an example of an operator with a purely continuous spectrum.
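
The absence of eigenvalues for multiplication by $t$ is asserted without proof. One way to see it, written as a short sketch that uses only the continuity of $x(t)$ on $[0, 1]$:

```latex
t\,x(t) = \lambda\,x(t) \quad (0 \le t \le 1)
\;\Longrightarrow\; (t - \lambda)\,x(t) = 0
\;\Longrightarrow\; x(t) = 0 \ \text{for all } t \ne \lambda
\;\Longrightarrow\; x \equiv 0 \ \text{(by continuity of } x\text{)}.
```

So the equation $Ax = \lambda x$ has no nonzero continuous solution for any $\lambda$, and every point of the spectrum $[0, 1]$ belongs to the continuous spectrum.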


Finally, for self-adjoint operators in a Hilbert space, we have the following 
analogue of a well-known result for finite-dimensional Euclidean spaces 
(proved in exactly the same way): 


THEOREM 8. Let A be a self-adjoint operator mapping a (complex) 
Hilbert space H into itself. Then all the eigenvalues of A are real, and two 
eigenvectors of A corresponding to distinct eigenvalues are orthogonal. 


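Proof. If
$$Ax = \lambda x \qquad (x \ne 0),$$
then
$$\lambda(x, x) = (Ax, x) = (x, Ax) = (x, \lambda x) = \bar{\lambda}(x, x),$$
and hence $\lambda = \bar{\lambda}$. Moreover, if
$$Ax = \lambda x, \qquad Ay = \mu y \qquad (\lambda \ne \mu),$$
then
$$\lambda(x, y) = (Ax, y) = (x, Ay) = (x, \mu y) = \bar{\mu}(x, y) = \mu(x, y),$$
and hence
$$(x, y) = 0,$$
i.e., the vectors $x$ and $y$ are orthogonal. ∎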
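
Both conclusions of Theorem 8 can be observed on a $2 \times 2$ Hermitian matrix. The sketch below is an illustration under stated assumptions: it uses the closed-form roots of the characteristic polynomial, requires the off-diagonal entry $b$ to be nonzero, and all function names and the sample matrix are hypothetical.

```python
import math


def hermitian_eig_2x2(a, b, d):
    """Eigenpairs of the Hermitian matrix [[a, b], [conj(b), d]].

    Assumes a and d are real and b is a nonzero complex number.
    """
    b_abs_sq = b.real ** 2 + b.imag ** 2
    disc = math.sqrt((a - d) ** 2 + 4 * b_abs_sq)  # real, so both roots are real
    pairs = []
    for lam in ((a + d - disc) / 2, (a + d + disc) / 2):
        # The first row of (A - lam I) v = 0 reads (a - lam) v1 + b v2 = 0,
        # which is satisfied by v = (b, lam - a).
        pairs.append((lam, (b, lam - a)))
    return pairs


def inner(x, y):
    """Scalar product, linear in x and conjugate-linear in y."""
    return sum(xi * yi.conjugate() for xi, yi in zip(x, y))


def apply(a, b, d, v):
    """Apply the matrix [[a, b], [conj(b), d]] to the vector v."""
    return (a * v[0] + b * v[1], b.conjugate() * v[0] + d * v[1])


(lam1, v1), (lam2, v2) = hermitian_eig_2x2(2.0, 1 + 1j, 3.0)
print(lam1, lam2, abs(inner(v1, v2)))
```

The two eigenvalues come out real and distinct, and the scalar product of the corresponding eigenvectors vanishes up to rounding.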


Problem 1. Given two normed linear spaces $E$ and $E_1$, a linear operator $A$ from $E$ to $E_1$, with domain $D_A$, is said to be closed if $x_n \in D_A$, $x_n \to x$, $Ax_n \to y$ implies $x \in D_A$, $Ax = y$. Prove that every bounded operator is closed.


Problem 2. Let $E$ and $E_1$ be normed linear spaces, with norms $\|\cdot\|$ and $\|\cdot\|_1$, respectively. By the direct (or Cartesian) product of $E$ and $E_1$, denoted by $E \times E_1$, we mean the set of all ordered pairs $(x, y)$, $x \in E$, $y \in E_1$. Prove that $E \times E_1$ is a normed linear space when equipped with the norm
$$\|(x, y)\| = \|x\| + \|y\|_1$$
(addition of elements and multiplication of elements by numbers being defined in the obvious way). By the graph of a linear operator $A$ from $E$ to $E_1$ we mean the subset of $E \times E_1$ equal to
$$G_A = \{(x, y) : x \in D_A,\ y = Ax\}.$$
Prove that

a) $G_A$ is a linear subspace of $E \times E_1$;
b) $G_A$ is closed if and only if the operator $A$ is closed;
c) If $E$ and $E_1$ are Banach spaces and if $A$ is closed and defined for all $x \in E$, so that $D_A = E$, then $A$ is bounded (this is Banach's closed graph theorem).

Hint. In c) apply Theorem 2 to the projection operator $P$ carrying each ordered pair $(x, Ax) \in G_A$ into the element $x \in E$.


Problem 3. Prove that if $A$ is an invertible continuous linear operator mapping a complete countably normed space $E$ into another complete countably normed space $E_1$, then the inverse operator $A^{-1}$ is itself continuous. State and prove the closed graph theorem for countably normed spaces.


Problem 4. Let $A$ be a continuous linear operator mapping a Banach space $E$ onto another Banach space $E_1$. Prove that there is a constant $\alpha > 0$ such that if $B \in \mathscr{L}(E, E_1)$ and $\|A - B\| < \alpha$, then $B$ also maps $E$ onto (all of) $E_1$.


Problem 5. Let $A$ be an operator mapping a Hilbert space $H$ into itself. Then a subspace $M \subset H$ is said to be invariant under $A$ if $x \in M$ implies $Ax \in M$. Prove that if $M$ is invariant under $A$, then its orthogonal complement $M' = H \ominus M$ is invariant under the adjoint operator $A^*$ (in particular, under $A$ itself if $A$ is self-adjoint).


Problem 6. Let A and B be bounded linear operators mapping a complex 
Hilbert space H into itself. Prove that 


a) $(\alpha A + \beta B)^* = \bar{\alpha}A^* + \bar{\beta}B^*$;
b) $(AB)^* = B^*A^*$;
c) $(A^*)^* = A$;
d) $I^* = I$, where $I$ is the identity operator.


Problem 7. Give an example of an operator whose spectrum consists of a single point.


Problem 8. Given a bounded linear operator $A$ mapping a Banach space $E$ into itself, prove that the limit
$$r = \lim_{n \to \infty} \sqrt[n]{\|A^n\|}$$
exists. Show that the spectrum of $A$ is contained in the disk of radius $r$ with center at the origin.


Comment. The quantity $r$ is called the spectral radius of the operator $A$. This result contains Theorem 7 as a special case, since $\|A^n\| \le \|A\|^n$.


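Problem 9. Let $R_\lambda = (A - \lambda I)^{-1}$ and $R_\mu = (A - \mu I)^{-1}$ be the resolvents corresponding to the points $\lambda$ and $\mu$. Prove that $R_\lambda R_\mu = R_\mu R_\lambda$ and
$$R_\mu - R_\lambda = (\mu - \lambda)R_\mu R_\lambda. \tag{19}$$

Hint. Multiply both sides of (19) by $(A - \lambda I)(A - \mu I)$.


Comment. It follows from (19) that if $\lambda_0$ is a regular point of $A$, then the derivative of $R_\lambda$ with respect to $\lambda$ at the point $\lambda_0$, i.e., the limit
$$\lim_{\Delta\lambda \to 0} \frac{R_{\lambda_0 + \Delta\lambda} - R_{\lambda_0}}{\Delta\lambda}$$
(in the sense of convergence with respect to the operator norm) exists and equals $R_{\lambda_0}^2$.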


Problem 10. Let A be a bounded self-adjoint operator mapping a complex 
Hilbert space H into itself. Prove that the spectrum of A is a closed bounded 
subset of the real line. 


Problem 11. Prove that every bounded linear operator defined on a complex Banach space with at least one nonzero element has a nonempty spectrum.


24. Completely Continuous Operators 


24.1. Definitions and examples. We now discuss a class of operators which
closely resemble operators acting in a finite-dimensional space and at the 
same time are very important from the standpoint of applications: 


DEFINITION. A linear operator $A$ mapping a Banach space $E$ into itself is said to be completely continuous if it maps every bounded set into a relatively compact set.

Remark 1. If $E$ is finite-dimensional, then every linear operator $A$ mapping $E$ into $E$ is completely continuous. In fact, $A$ maps bounded sets into bounded sets (recall Problem 1, p. 226) and hence maps bounded sets into relatively compact sets (why?).


Remark 2. In an infinite-dimensional space, complete continuity of an 
operator is a stronger requirement than merely being continuous (i.e., 
bounded). For example, the identity operator in an infinite-dimensional 
space is continuous but not completely continuous (see Example 1 below). 


LEMMA. Let $x_1, x_2, \ldots$ be linearly independent vectors in a normed linear space $E$, and let $E_n$ be the subspace generated by the vectors $x_1, \ldots, x_n$. Then there are vectors $y_1, y_2, \ldots$ such that $y_n \in E_n$, $\|y_n\| = 1$ and⁴
$$\rho(E_{n-1}, y_n) = \inf_{x \in E_{n-1}} \|x - y_n\| > \tfrac{1}{2}.$$

Proof. Since the vectors $x_1, x_2, \ldots$ are linearly independent, we have $x_n \notin E_{n-1}$ and hence
$$\rho(E_{n-1}, x_n) = \alpha > 0$$
(recall Problem 5a, p. 141). Let $x^*$ be a vector in $E_{n-1}$ such that
$$\|x_n - x^*\| < 2\alpha.$$
Then
$$\rho(E_{n-1}, x_n - x^*) = \alpha,$$
and the vectors
$$y_1 = \frac{x_1}{\|x_1\|}, \quad \ldots, \quad y_n = \frac{x_n - x^*}{\|x_n - x^*\|}, \quad \ldots$$
satisfy all the conditions of the lemma. ∎

Example 1. The identity operator $I$ in an infinite-dimensional Banach space $E$ is not completely continuous. In fact, we need only show that the closed unit sphere $S$ in $E$ (which is obviously carried into itself by $I$) is not compact. This follows at once from the lemma, since $S$ contains a sequence of vectors $y_1, y_2, \ldots$ such that
$$\rho(y_{n-1}, y_n) > \tfrac{1}{2}$$
(more generally, $\|y_m - y_n\| > \tfrac{1}{2}$ for all $m < n$, since $y_m \in E_{n-1}$), and such a sequence clearly cannot contain a convergent subsequence.
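
For a concrete instance of such a sequence, one may take the coordinate vectors in the space $l_2$ of square-summable sequences. This choice of space is an assumption made here for illustration; the example in the text concerns an arbitrary infinite-dimensional Banach space.

```latex
e_n = (0, \dots, 0, \underbrace{1}_{n\text{th place}}, 0, \dots) \in l_2,
\qquad \|e_n\| = 1,
\qquad \|e_m - e_n\| = \sqrt{1^2 + 1^2} = \sqrt{2} \quad (m \ne n).
```

All the $e_n$ lie in the closed unit sphere, yet no subsequence is a Cauchy sequence, so the sphere is not compact.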


Example 2. Let $A$ be a continuous linear operator on an infinite-dimensional Banach space $E$, where $A$ is "degenerate" in the sense that it maps $E$ into a finite-dimensional subspace of $E$. Then $A$ is completely continuous, since it maps every bounded subset $M \subset E$ into a bounded subset of a finite-dimensional space, and hence into a relatively compact set.

⁴ The quantity $\rho(E_{n-1}, y_n)$ is, of course, just the distance between the set $E_{n-1}$ and the point $y_n$ (cf. Problem 9, p. 54).


Turning to the space $C_{[a,b]}$ of functions continuous on the interval $[a, b]$, we now establish conditions under which the "integral operator" $A$ defined by
$$\psi(x) = (A\varphi)(x) = \int_a^b K(x, y)\varphi(y)\,dy \tag{1}$$
is completely continuous.

THEOREM 1. Suppose the kernel $K(x, y)$ is such that

1) $K(x, y)$ is bounded on the square $a \le x \le b$, $a \le y \le b$;
2) The discontinuities (if any) of $K(x, y)$ all lie on a finite number of curves
$$y = f_k(x) \qquad (k = 1, \ldots, n),$$
where the functions $f_k$ are continuous.

Then (1) is a completely continuous operator mapping $C_{[a,b]}$ into $C_{[a,b]}$.

Proof. First we note that the conditions 1) and 2) guarantee the existence of the integral (1) for every $x \in [a, b]$, so that $\psi(x)$ is defined on $[a, b]$. Let $R$ be the square $a \le x \le b$, $a \le y \le b$, and let
$$M = \sup_{(x, y) \in R} |K(x, y)|. \tag{2}$$
Moreover, let $G$ be the set of all points $(x, y) \in R$ such that
$$|y - f_k(x)| < \frac{\varepsilon}{12Mn}$$
for at least one integer $k = 1, \ldots, n$, and let $F = R - G$. Since $F$ is compact (why?) and $K(x, y)$ is continuous on $F$, given any $\varepsilon > 0$, there is a $\delta > 0$ such that
$$|K(x', y) - K(x'', y)| < \frac{\varepsilon}{3(b - a)} \tag{3}$$
for any two points $(x', y), (x'', y) \in F$ satisfying the condition
$$|x' - x''| < \delta \tag{4}$$
(recall Theorem 1, p. 109).
Now suppose (4) holds. Then
$$|\psi(x') - \psi(x'')| \le \int_a^b |K(x', y) - K(x'', y)|\,|\varphi(y)|\,dy. \tag{5}$$
To estimate the integral on the right, we divide the interval $a \le y \le b$ into the set
$$P = \bigcup_{k=1}^n \Bigl\{y : |y - f_k(x')| < \frac{\varepsilon}{12Mn}\Bigr\} \cup \bigcup_{k=1}^n \Bigl\{y : |y - f_k(x'')| < \frac{\varepsilon}{12Mn}\Bigr\}$$
and the complementary set $Q = [a, b] - P$. Using (2) and noting that $P$ is a union of intervals of total length no greater than $\varepsilon/3M$, we have
$$\int_P |K(x', y) - K(x'', y)|\,|\varphi(y)|\,dy \le \frac{2\varepsilon}{3}\,\|\varphi\|, \tag{6}$$
where, as usual,
$$\|\varphi\| = \sup_{a \le y \le b} |\varphi(y)|.$$
On the other hand, it follows from (3) and (4) that
$$\int_Q |K(x', y) - K(x'', y)|\,|\varphi(y)|\,dy \le \frac{\varepsilon}{3}\,\|\varphi\|. \tag{7}$$
Comparing (5)-(7), we find that (4) implies
$$|\psi(x') - \psi(x'')| \le \varepsilon\,\|\varphi\|. \tag{8}$$
In particular, $\psi$ is continuous on $[a, b]$, so that the operator $A$ defined by (1) actually maps the space $C_{[a,b]}$ into itself. Moreover, it follows from (8) and from the estimate
$$\|\psi\| = \sup_{a \le x \le b} |\psi(x)| \le \sup_{a \le x \le b} \int_a^b |K(x, y)|\,|\varphi(y)|\,dy \le M(b - a)\,\|\varphi\|$$
that $A$ carries any (uniformly) bounded set of functions $\Phi \subset C_{[a,b]}$ into a (uniformly) bounded equicontinuous set $\Psi \subset C_{[a,b]}$ (recall Definitions 3 and 4, p. 102). But then $\Psi$ is relatively compact, by Arzelà's theorem (Theorem 4, p. 102), and hence $A$ is completely continuous. ∎


Remark 1. The requirement that the discontinuities of the kernel $K(x, y)$ lie on a finite number of curves, each intersecting the lines $x = \text{const}$ in a single point, is essential. For example, let $K(x, y)$ be the function
$$K(x, y) = \begin{cases} 1 & \text{if } x < \tfrac{1}{2}, \\ 0 & \text{if } x \ge \tfrac{1}{2}, \end{cases}$$
defined on the square $0 \le x \le 1$, $0 \le y \le 1$. Then $K(x, y)$ is discontinuous at every point of the line segment $x = \tfrac{1}{2}$, $0 \le y \le 1$, and the operator (1) with this kernel maps the function $\varphi(y) \equiv 1$ into a discontinuous function.


Remark 2. If $K(x, y) = 0$ for $y > x$, then (1) takes the form
$$\psi(x) = (A\varphi)(x) = \int_a^x K(x, y)\varphi(y)\,dy.$$
Suppose $K(x, y)$ is continuous for $y \le x$. Then it follows from Theorem 1 that the operator $A$, called a Volterra operator, is completely continuous.
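
To make the form of a Volterra operator concrete, the integral can be approximated numerically. The sketch below is an illustration only: it uses the trapezoidal rule, and the function name, the step count and the sample kernel $K(x, y) = 1$ are assumptions introduced here.

```python
def volterra_apply(kernel, phi, a, x, steps=1000):
    """Approximate (A phi)(x) = integral from a to x of K(x, y) phi(y) dy.

    Uses the composite trapezoidal rule with the given number of steps.
    """
    if x == a:
        return 0.0
    h = (x - a) / steps
    # Endpoint terms carry weight 1/2 in the trapezoidal rule.
    total = 0.5 * (kernel(x, a) * phi(a) + kernel(x, x) * phi(x))
    for i in range(1, steps):
        y = a + i * h
        total += kernel(x, y) * phi(y)
    return total * h


# With K(x, y) = 1 for y <= x, the operator is indefinite integration:
# applied to phi(y) = y on [0, 1] it should give x^2 / 2 at the point x.
value = volterra_apply(lambda x, y: 1.0, lambda y: y, 0.0, 1.0)
print(value)
```

Only values of the kernel with $y \le x$ are ever used, which is exactly the restriction $K(x, y) = 0$ for $y > x$.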


24.2. Basic properties of completely continuous operators. We begin with 


THEOREM 2. Given a sequence $\{A_n\}$ of completely continuous operators mapping a Banach space $E$ into itself, suppose $\{A_n\}$ converges in norm to an operator $A$, i.e., suppose $\|A - A_n\| \to 0$ as $n \to \infty$. Then $A$ is itself completely continuous.


Proof. To prove that $A$ is completely continuous, we need only show that the sequence $\{Ax_n\}$ contains a convergent subsequence whenever the sequence $\{x_n\}$ of elements $x_n \in E$ is bounded, i.e., such that
$$\|x_n\| \le M \tag{9}$$
for some $M > 0$ and all $n = 1, 2, \ldots$ (why is $A$ linear?). Since $A_1$ is completely continuous, the sequence $\{A_1x_n\}$ contains a convergent subsequence. In other words, there is a subsequence $\{x_n^{(1)}\}$ of the sequence $\{x_n\}$ such that $\{A_1x_n^{(1)}\}$ converges. Similarly, since $A_2$ is completely continuous, the sequence $\{A_2x_n^{(1)}\}$ in turn contains a convergent subsequence. Thus there is a subsequence $\{x_n^{(2)}\}$ of the sequence $\{x_n^{(1)}\}$ such that $\{A_2x_n^{(2)}\}$ converges. Then obviously $\{A_1x_n^{(2)}\}$ also converges. Continuing this argument, we find a subsequence $\{x_n^{(3)}\}$ of the sequence $\{x_n^{(2)}\}$ such that $\{A_1x_n^{(3)}\}$, $\{A_2x_n^{(3)}\}$, $\{A_3x_n^{(3)}\}$ all converge, and so on. Consider the "diagonal sequence"
$$x_1^{(1)}, x_2^{(2)}, \ldots, x_n^{(n)}, \ldots$$
Then clearly each of the operators $A_1, A_2, \ldots, A_n, \ldots$ maps this sequence into a convergent sequence.

We now show that the sequence $\{Ax_n^{(n)}\}$ also converges, thereby completing the proof. Since the space $E$ is complete, it is enough to show that $\{Ax_n^{(n)}\}$ is a Cauchy sequence. Clearly
$$\|Ax_n^{(n)} - Ax_{n'}^{(n')}\| \le \|Ax_n^{(n)} - A_kx_n^{(n)}\| + \|A_kx_n^{(n)} - A_kx_{n'}^{(n')}\| + \|A_kx_{n'}^{(n')} - Ax_{n'}^{(n')}\|. \tag{10}$$
Given any $\varepsilon > 0$, first choose $k$ such that
$$\|A - A_k\| < \frac{\varepsilon}{3M}. \tag{11}$$
Next, using the fact that $\{A_kx_n^{(n)}\}$ converges and hence is a Cauchy sequence, choose $N$ such that
$$\|A_kx_n^{(n)} - A_kx_{n'}^{(n')}\| < \frac{\varepsilon}{3} \tag{12}$$
for all $n, n' > N$. Then it follows from (9)-(12) that
$$\|Ax_n^{(n)} - Ax_{n'}^{(n')}\| < \frac{\varepsilon}{3} + \frac{\varepsilon}{3} + \frac{\varepsilon}{3} = \varepsilon$$
for all sufficiently large $n$ and $n'$, i.e., $\{Ax_n^{(n)}\}$ is a Cauchy sequence. ∎


Not only is the set of completely continuous operators closed (algebraically) under operator multiplication, but we have the following much stronger result:

THEOREM 3. Let $A$ be a completely continuous operator and $B$ a bounded operator mapping a Banach space $E$ into itself. Then the operators $AB$ and $BA$ are completely continuous.

Proof. If the set $M \subset E$ is bounded, then $BM = \{y : y = Bx,\ x \in M\}$ is also bounded. Therefore $ABM$ is relatively compact, and hence $AB$ is completely continuous. Moreover, if $M$ is bounded, then $AM$ is relatively compact, and hence $BAM$ is also relatively compact by the continuity of $B$, i.e., $BA$ is completely continuous. ∎


COROLLARY. A completely continuous operator $A$ mapping a Banach space $E$ into itself cannot have a bounded inverse if $E$ is infinite-dimensional.

Proof. If $A^{-1}$ were bounded, then, by Theorem 3, the identity operator $I = A^{-1}A$ would be completely continuous. But this is impossible, by Example 1, p. 240. ∎


THEOREM 4. Let $A$ be a completely continuous operator mapping a Banach space $E$ into itself. Then the adjoint operator $A^*$ is also completely continuous.

Proof. We must show that $A^*$ carries every bounded subset of the conjugate space $E^*$ into a relatively compact set. Since every bounded subset of a normed linear space is contained in some closed sphere, it is enough to show that $A^*$ maps every closed sphere into a relatively compact set. In fact, by the linearity of $A^*$, we need only show that the image $A^*S^*$ of the closed unit sphere $S^* \subset E^*$ is relatively compact.

Now suppose we regard the elements of $E^*$ as functionals not on the whole space $E$ but only on the compactum $[AS]$ equal to the closure of the image of the closed unit sphere under the operator $A$. Then the set $\Phi$ of functionals on $[AS]$ corresponding to those in $S^*$ is uniformly bounded and equicontinuous, since $\|\varphi\| \le 1$ implies
$$\sup_{x \in [AS]} |\varphi(x)| = \sup_{x \in AS} |\varphi(x)| \le \|\varphi\| \sup_{x \in S} \|Ax\| \le \|A\|$$
and
$$|\varphi(x') - \varphi(x'')| \le \|\varphi\|\,\|x' - x''\| \le \|x' - x''\|.$$
Hence, by Arzelà's theorem (Theorem 4, p. 102), $\Phi$ is relatively compact in the space $C_{[AS]}$ of all continuous linear functionals on $[AS]$. But the set $\Phi$, with the metric induced by the usual metric of $C_{[AS]}$, is isometric to the set $A^*S^*$, with the metric induced by the norm of the space $E^*$. In fact, if $g_1, g_2 \in S^*$, then
$$\|A^*g_1 - A^*g_2\| = \sup_{x \in S} |(A^*g_1 - A^*g_2, x)| = \sup_{x \in S} |(g_1 - g_2, Ax)|$$
$$= \sup_{z \in AS} |(g_1 - g_2, z)| = \sup_{z \in [AS]} |(g_1 - g_2, z)| = \rho(g_1, g_2).$$
Being relatively compact, the set $\Phi$ is totally bounded, by Theorem 3, p. 101. Therefore the set $A^*S^*$ isometric to $\Phi$ is also totally bounded, and hence relatively compact, by the same theorem. ∎


THEOREM 5. Let $A$ be a completely continuous operator mapping a Banach space $E$ into itself. Then, given any $\delta > 0$, there are only finitely many linearly independent eigenvectors of $A$ corresponding to eigenvalues of absolute value greater than $\delta$.


Proof. Given a nonzero eigenvalue $\lambda$ of $A$, let $E_\lambda$ be the subspace of $E$ consisting of all eigenvectors of $A$ corresponding to $\lambda$.⁵ Then $E_\lambda$ is finite-dimensional, since otherwise $A$ would fail to be completely continuous in $E_\lambda$ and hence in $E$ itself, by virtually the same argument as in Example 1, p. 240. Therefore, to complete the proof, we need only show that if $\{\lambda_n\}$ is any sequence of distinct eigenvalues of $A$, then $\lambda_n \to 0$ as $n \to \infty$. This in turn will be proved once we show that there is no infinite sequence $\{\lambda_n\}$ of distinct eigenvalues of $A$ such that the sequence $\{1/\lambda_n\}$ is bounded.

Thus, suppose there is a sequence $\{\lambda_n\}$ of distinct eigenvalues of $A$ such that $\{1/\lambda_n\}$ is bounded, and let $x_n$ be an eigenvector of $A$ corresponding to the eigenvalue $\lambda_n$. Then the vectors $x_1, x_2, \ldots$ are linearly independent, by the same argument as in the case where $E$ is finite-dimensional.⁶ Let $E_n$ be the subspace generated by $x_1, \ldots, x_n$, i.e., the set of all elements of the form
$$y = \sum_{k=1}^n \alpha_k x_k.$$
For every $y \in E_n$, we have
$$y - \frac{1}{\lambda_n}Ay = \sum_{k=1}^n \alpha_k x_k - \sum_{k=1}^n \frac{\alpha_k \lambda_k}{\lambda_n}x_k = \sum_{k=1}^{n-1} \alpha_k\Bigl(1 - \frac{\lambda_k}{\lambda_n}\Bigr)x_k,$$
so that
$$y - \frac{1}{\lambda_n}Ay \in E_{n-1}.$$
Let $\{y_n\}$ be a sequence such that $y_n \in E_n$, $\|y_n\| = 1$ and
$$\rho(E_{n-1}, y_n) = \inf_{x \in E_{n-1}} \|x - y_n\| > \tfrac{1}{2}$$
(such a sequence exists by the lemma on p. 240). Then $\{y_n/\lambda_n\}$ is a bounded sequence in $E$, since the numerical sequence $\{1/\lambda_n\}$ is bounded. But at the same time the sequence $\{A(y_n/\lambda_n)\}$ cannot contain a convergent subsequence, contrary to the complete continuity of $A$, since
$$\Bigl\|A\Bigl(\frac{y_p}{\lambda_p}\Bigr) - A\Bigl(\frac{y_q}{\lambda_q}\Bigr)\Bigr\| = \Bigl\|y_p - \Bigl[y_p - \frac{1}{\lambda_p}Ay_p + A\Bigl(\frac{y_q}{\lambda_q}\Bigr)\Bigr]\Bigr\| > \tfrac{1}{2}$$
for all $p > q$, since
$$y_p - \frac{1}{\lambda_p}Ay_p + A\Bigl(\frac{y_q}{\lambda_q}\Bigr) \in E_{p-1}.$$
This contradiction proves the theorem. ∎

⁵ Note that $E_\lambda$ is invariant under $A$ in the sense that $x \in E_\lambda$ implies $Ax \in E_\lambda$ (cf. Problem 5, p. 238).
⁶ See e.g., G. E. Shilov, op. cit., Lemma 1, p. 182.


24.3. Completely continuous operators in Hilbert space. Specializing to 
the case of completely continuous operators mapping a Hilbert space into 
itself, we have 


THEOREM 6. Let A be a linear operator mapping a Hilbert space H 
into itself. Then A is completely continuous if and only if 


1) A maps every relatively compact set in the weak topology into a 
relatively compact set in the strong topology; 

2) A maps every weakly convergent sequence into a strongly convergent 
sequence. 


Proof. To prove 1), we merely note that H is the conjugate of a 
separable space, since H = H*, and hence, by Corollary 2’, p. 205, a 
subset of H is bounded if and only if it is relatively compact in the weak 
topology. 

To prove 2), suppose A maps every weakly convergent sequence 
into a strongly convergent sequence, and let M be a bounded closed sub- 
set of H. Then M contains a weakly convergent sequence and hence AM 
contains a strongly convergent sequence, i.e., AM is relatively compact 
in the strong topology. It follows that A is completely continuous. 
Conversely, if $A$ is completely continuous, let $\{x_n\}$ be a weakly convergent
sequence with weak limit $x$. Then $\{Ax_n\}$ contains a strongly convergent
subsequence. At the same time, $\{Ax_n\}$ converges weakly to $Ax$, by the
continuity of $A$, so that $\{Ax_n\}$ cannot have more than one limit point.
Therefore $\{Ax_n\}$ is a strongly convergent sequence. ∎


Let $A$ be a self-adjoint operator in a finite-dimensional complex Euclidean
space, and suppose $A$ has matrix $\|a_{ik}\|$ (recall Example 3, p. 222). Then it
will be recalled from linear algebra that $\|a_{ik}\|$ can be reduced to diagonal
form with respect to a suitable orthonormal basis.⁷ We now generalize this
result to the case of a completely continuous self-adjoint operator in a (real
or complex) Hilbert space (see Theorem 7 below), after first proving two
preliminary lemmas:


LEMMA 1. Let $A$ be a completely continuous self-adjoint operator
mapping a Hilbert space $H$ into itself, and let $\{x_n\}$ be a sequence in $H$
converging weakly to $x$. Then

$$(Ax_n, x_n) \to (Ax, x) \qquad (13)$$

as $n \to \infty$.


Proof. Clearly,

$$|(Ax_n, x_n) - (Ax, x)| \le |(Ax_n, x_n) - (Ax, x_n)| + |(Ax, x_n) - (Ax, x)|.$$

But

$$|(Ax_n, x_n) - (Ax, x_n)| \le \|x_n\| \, \|A(x_n - x)\|,$$

and

$$|(Ax, x_n) - (Ax, x)| = |(x, A(x_n - x))| \le \|x\| \, \|A(x_n - x)\|,$$

where the numbers $\|x_n\|$, $n = 1, 2, \ldots$ are bounded, by Theorem 2,
p. 196, and $\|A(x_n - x)\| \to 0$ by Theorem 6. Therefore

$$|(Ax_n, x_n) - (Ax, x)| \to 0$$

as $n \to \infty$, which is equivalent to (13). ∎


LEMMA 2. Given a bounded linear operator $A$ mapping a Hilbert space
$H$ into itself, let $A$ be self-adjoint and suppose the least upper bound of the
functional

$$|Q(x)| = |(Ax, x)|$$

on the closed unit sphere $\|x\| \le 1$ is achieved at the point $x = x_0$. Then

$$(x_0, y) = 0 \qquad (14)$$

implies

$$(Ax_0, y) = (x_0, Ay) = 0.$$

In particular, $x_0$ is an eigenvector of $A$.


⁷ See e.g., V. I. Smirnov, Linear Algebra and Group Theory (translated by R. A.
Silverman), McGraw-Hill Book Co., New York (1961), Sec. 40. Dover reprint (1970).


Proof. Obviously,

$$\|x_0\| = 1. \qquad (15)$$

Let

$$x = \frac{x_0 + a y}{\sqrt{1 + |a|^2 \|y\|^2}},$$

where $a$ is an arbitrary complex number. Then $\|x\| = 1$, because of
(14) and (15). Since

$$Q(x) = \frac{1}{1 + |a|^2 \|y\|^2} \left[ Q(x_0) + 2 \operatorname{Re} \bar{a}(Ax_0, y) + |a|^2 Q(y) \right],$$

we have

$$Q(x) = Q(x_0) + 2 \operatorname{Re} \bar{a}(Ax_0, y) + O(|a|^2) \qquad (16)$$

for small $|a|$. But it is clear from (16) that if $(Ax_0, y) \ne 0$, then $a$ can be
chosen to make $|Q(x)| > |Q(x_0)|$, contrary to the assumption that the
least upper bound of $|Q(x)|$ on the closed unit sphere is achieved at the
point $x = x_0$. Therefore $(Ax_0, y) = 0$ as asserted, i.e., $Ax_0$ is orthogonal
to every vector orthogonal to $x_0$. It follows that $Ax_0$ and $x_0$ are
proportional (why?), so that $x_0$ is an eigenvector of $A$. ∎


THEOREM 7 (Hilbert-Schmidt). Let $A$ be a completely continuous
self-adjoint operator mapping a Hilbert space $H$ into itself. Then there is an
orthonormal system $\varphi_1, \varphi_2, \ldots$ of eigenvectors of $A$, with corresponding
nonzero eigenvalues $\lambda_1, \lambda_2, \ldots$, such that every element $x \in H$ has a unique
representation of the form⁸

$$x = \sum_n c_n \varphi_n + x', \qquad (17)$$

where $x'$ satisfies the condition $Ax' = 0$. Moreover

$$Ax = \sum_n \lambda_n c_n \varphi_n \qquad (18)$$

and

$$\lim_{n \to \infty} \lambda_n = 0$$

in the case where there are infinitely many nonzero eigenvalues.


⁸ As will appear in the course of the proof, the sums in (17) and (18) may be finite or
infinite, and $x'$ may vanish.


Proof. Let

$$M_1 = \sup_{\|x\| \le 1,\; x \in H} |(Ax, x)|,$$

and let $\{x_n\}$ be a sequence of elements of $H$ such that $\|x_n\| = 1$ and

$$|(Ax_n, x_n)| \to M_1$$

as $n \to \infty$. Since the closed unit sphere in $H$ is weakly compact (recall
Corollary 2′, p. 205), we can find a subsequence of $\{x_n\}$ which converges
weakly to an element $y \in H$, where clearly $\|y\| \le 1$. By Lemma 1,

$$|(Ay, y)| = M_1,$$

and hence, by Lemma 2, $y$ is an eigenvector of $A$. Moreover $\|y\| = 1$,
since if $\|y\| < 1$, then choosing

$$y' = \frac{y}{\|y\|}$$

we would have $\|y'\| = 1$ and

$$|(Ay', y')| > M_1,$$

contrary to the meaning of $M_1$. We choose $y$ as our first eigenvector $\varphi_1$.
Let $\lambda_1$ be the corresponding eigenvalue, so that

$$A\varphi_1 = \lambda_1 \varphi_1.$$

Then

$$|\lambda_1| = |(A\varphi_1, \varphi_1)| = M_1.$$


Next let $E_1$ be the subspace of $H$ consisting of all vectors of the form
$\alpha\varphi_1$, and let $E_1' = H \ominus E_1$ be the orthogonal complement of $E_1$. Clearly
$E_1'$ is again a Hilbert space, mapped into itself by the operator $A$ (this
follows from Problem 5, p. 238 and the fact that $A$ is self-adjoint). Let

$$M_2 = \sup_{\|x\| \le 1,\; x \in E_1'} |(Ax, x)|. \qquad (19)$$

Then, by the same argument as before, we can find an eigenvector $\varphi_2$ of
$A$ such that $\varphi_2 \in E_1'$, $\|\varphi_2\| = 1$. Let $\lambda_2$ be the corresponding eigenvalue,
so that

$$A\varphi_2 = \lambda_2 \varphi_2.$$

Then

$$|\lambda_2| = |(A\varphi_2, \varphi_2)| = M_2,$$

and hence

$$|\lambda_1| \ge |\lambda_2|,$$

since $H \supset E_1'$ implies

$$M_1 = \sup_{\|x\| \le 1,\; x \in H} |(Ax, x)| \ge \sup_{\|x\| \le 1,\; x \in E_1'} |(Ax, x)| = M_2.$$

By its very construction, $\varphi_2$ is orthogonal to $\varphi_1$.
To construct further eigenvectors of $A$, we argue inductively, replacing
(19) by

$$M_{n+1} = \sup_{\|x\| \le 1,\; x \in E_n'} |(Ax, x)| \qquad (n = 1, 2, \ldots),$$

where $E_n' = H \ominus E_n$ is the orthogonal complement of the subspace $E_n$
generated by the previously constructed eigenvectors $\varphi_1, \varphi_2, \ldots, \varphi_n$.
Then $E_n'$ is again a Hilbert space mapped into itself by $A$, and there is an
eigenvector $\varphi_{n+1} \in E_n'$ of unit norm, with corresponding eigenvalue
$\lambda_{n+1}$ satisfying the inequality

$$|\lambda_n| \ge |\lambda_{n+1}| \qquad (n = 1, 2, \ldots).$$

In this way, we construct an orthonormal system $\{\varphi_n\}$ of eigenvectors of $A$.
There are now just two possibilities, which we examine in turn: 


Case 1. Suppose the construction of the sequence $\{\varphi_n\}$ terminates after
a finite number of steps, i.e., suppose there is a positive integer $n_0$ such
that $(Ax, x) = 0$ on $E_{n_0}'$. Then it follows from Lemma 2 that $A$ maps
the whole space $E_{n_0}'$ into the zero vector. According to Theorem 14,
p. 158, every element $x \in H$ has a unique representation of the form

$$x = h + x',$$

where $h \in E_{n_0}$, $x' \in E_{n_0}'$, and hence of the form

$$x = \sum_n c_n \varphi_n + x',$$

where the sum is finite (consisting of $n_0$ terms) and $Ax' = 0$. Obviously
we have

$$Ax = \sum_n \lambda_n c_n \varphi_n,$$

thereby completing the proof in this case.


Case 2. Suppose the construction of the sequence $\{\varphi_n\}$ never terminates,
i.e., suppose $(Ax, x) \not\equiv 0$ on $E_n'$ for all $n = 1, 2, \ldots$. We then
have infinitely many nonzero eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_n, \ldots$. Clearly
$\lambda_n \to 0$ as $n \to \infty$. In fact, the sequence $\{\varphi_n\}$ converges weakly to zero,
like any sequence of orthonormal vectors (why?), and hence the sequence
$\{A\varphi_n\}$ converges to zero in norm, so that $\|A\varphi_n\| \to 0$ and hence
$\|\lambda_n \varphi_n\| = |\lambda_n| \to 0$. Let $E_\infty$ be the subspace of $H$ generated by all the
eigenvectors $\varphi_1, \varphi_2, \ldots, \varphi_n, \ldots$, i.e., the set of all linear combinations
of the form

$$\sum_n c_n \varphi_n,$$

and let

$$E_\infty' = H \ominus E_\infty = \bigcap_{n=1}^{\infty} E_n'.$$

If $E_\infty' = \{0\}$, then $H = E_\infty$ and $x$ obviously has a representation of the
form (17) with $x' = 0$ (so that $Ax' = 0$ trivially). If $E_\infty' \ne \{0\}$, let $x$ be any
nonzero element of $E_\infty'$. Then

$$|(Ax, x)| \le |\lambda_n| \, \|x\|^2$$

for all $n = 1, 2, \ldots$, and hence $(Ax, x) = 0$ on $E_\infty'$. It follows from
Lemma 2 that $A$ maps the whole space $E_\infty'$ into the zero vector. The rest
of the proof is the same as in Case 1, where (18) follows from (17) by
the continuity of $A$. ∎
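
The inductive construction in the proof can be imitated numerically in the finite-dimensional case. The following is only a sketch under stated assumptions: the 3 x 3 symmetric matrix `A` is chosen purely for illustration (its eigenvalues are $3 + \sqrt{3}$, $3$, $3 - \sqrt{3}$), and the maximization of $|(Ax, x)|$ on the unit sphere of $E_n'$ is replaced by power iteration restricted to $E_n'$, which is not the argument used in the text (the proof relies on weak compactness) and works here only because the eigenvalues have distinct, nonzero absolute values.

```python
import math

def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def normalize(x):
    n = math.sqrt(dot(x, x))
    return [a / n for a in x]

def project_out(x, basis):
    # Project x onto the orthogonal complement E_n' of the span of the
    # orthonormal vectors already found.
    for phi in basis:
        c = dot(x, phi)
        x = [a - c * p for a, p in zip(x, phi)]
    return x

def eigen_system(A, iterations=500):
    # Assumption: A is symmetric with eigenvalues of distinct nonzero modulus,
    # so power iteration inside E_n' converges to the next eigenvector.
    n = len(A)
    phis, lams = [], []
    for _ in range(n):
        x = normalize(project_out([1.0] * n, phis))
        for _ in range(iterations):
            x = normalize(project_out(matvec(A, x), phis))
        phis.append(x)
        lams.append(dot(matvec(A, x), x))  # the value (A phi, phi)
    return lams, phis

# Illustrative matrix (an assumption, not taken from the text).
A = [[4.0, 1.0, 0.0],
     [1.0, 3.0, 1.0],
     [0.0, 1.0, 2.0]]
lams, phis = eigen_system(A)

# Check the expansion (18): Ax = sum of lambda_n c_n phi_n with c_n = (x, phi_n).
x = [1.0, -2.0, 0.5]
c = [dot(x, phi) for phi in phis]
series = [sum(l * cn * phi[i] for l, cn, phi in zip(lams, c, phis))
          for i in range(3)]
residual = max(abs(s - t) for s, t in zip(series, matvec(A, x)))
print(lams, residual)
```

Since this particular matrix has no zero eigenvalue, the component $x'$ in (17) vanishes, and the eigenvalues come out ordered by decreasing absolute value, as in the proof.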


COROLLARY. Let $A$ be a completely continuous self-adjoint operator
mapping a Hilbert space $H$ into itself. Then there is an orthonormal
system $\{\psi_n\}$ of eigenvectors of $A$ such that every element $x \in H$ has a unique
representation of the form

$$x = \sum_{n=1}^{\infty} c_n \psi_n.$$

Moreover

$$Ax = \sum_{n=1}^{\infty} \lambda_n c_n \psi_n,$$

where $\lambda_1, \lambda_2, \ldots$ are the eigenvalues corresponding to $\psi_1, \psi_2, \ldots$.


Proof. Noting that every nonzero element of $E_{n_0}'$ or $E_\infty'$ is an eigenvector of $A$
corresponding to the eigenvalue $\lambda = 0$, let $\{\psi_n\}$ consist of the orthonormal
system $\{\varphi_n\}$ constructed in the proof of Theorem 7, together
with an arbitrary orthonormal basis in $E_{n_0}'$ or $E_\infty'$. ∎


Problem 1. Prove that the projection operator of Example 4, p. 222 is
completely continuous if and only if the subspace $H_1$ is finite-dimensional.
Problem 2. Prove that the operator $A$ mapping the point

$$x = (x_1, x_2, \ldots, x_n, \ldots) \in l_2$$

into the point

$$Ax = \left( \frac{x_1}{2}, \frac{x_2}{2^2}, \ldots, \frac{x_n}{2^n}, \ldots \right)$$

is completely continuous. More generally, suppose

$$Ax = (a_1 x_1, a_2 x_2, \ldots, a_n x_n, \ldots).$$

Under what conditions on the sequence $\{a_n\}$ is $A$ completely continuous?


Hint. Since every bounded set in $l_2$ is contained in some closed sphere,
it is enough to show that the images of spheres are relatively compact. In
fact, by the linearity of $A$, it need only be shown that the image of the unit
sphere is compact. In this regard, recall Example 5, p. 98.


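Problem 3. Let $A$ be the integral operator on $C_{[-1,1]}$ defined by

$$\psi(x) = (A\varphi)(x) = \int_{-1}^{x} \varphi(y) \, dy.$$

Prove that $A$ maps the closed unit sphere in $C_{[-1,1]}$ into a noncompact set.
Reconcile this with Theorem 1.

Hint. Let

$$\varphi_n(x) = \begin{cases} 0 & \text{if } -1 \le x \le 0, \\ nx & \text{if } 0 < x \le \frac{1}{n}, \\ 1 & \text{if } \frac{1}{n} < x \le 1. \end{cases}$$

Then $\varphi_n \in C_{[-1,1]}$, $\|\varphi_n\| = 1$ for all $n$, and

$$\psi_n(x) = (A\varphi_n)(x) = \begin{cases} 0 & \text{if } -1 \le x \le 0, \\ \frac{nx^2}{2} & \text{if } 0 < x \le \frac{1}{n}, \\ x - \frac{1}{2n} & \text{if } \frac{1}{n} < x \le 1. \end{cases}$$

The sequence $\{\psi_n\}$ converges in $C_{[-1,1]}$ to the function

$$\psi(x) = \begin{cases} 0 & \text{if } -1 \le x \le 0, \\ x & \text{if } 0 < x \le 1, \end{cases}$$

which, having a discontinuous derivative, cannot be the image under $A$ of
any function in $C_{[-1,1]}$.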
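
The uniform convergence claimed in the hint can be checked numerically. This is only an illustrative sketch: the helper names `psi_n`, `psi` and `sup_distance` and the grid size are assumptions, and the supremum is approximated on a finite grid. A direct calculation from the formulas in the hint gives $\sup |\psi_n - \psi| = 1/(2n)$, which the grid values should reproduce up to rounding.

```python
def psi_n(n, x):
    # Image of phi_n under A, taken from the formula in the hint.
    if x <= 0:
        return 0.0
    if x <= 1.0 / n:
        return n * x * x / 2.0
    return x - 1.0 / (2.0 * n)

def psi(x):
    # Limit function: 0 on [-1, 0], x on (0, 1].
    return x if x > 0 else 0.0

def sup_distance(n, grid_points=20001):
    # Approximate the sup norm on [-1, 1] by a maximum over a uniform grid
    # that includes both endpoints.
    xs = [-1.0 + 2.0 * i / (grid_points - 1) for i in range(grid_points)]
    return max(abs(psi_n(n, x) - psi(x)) for x in xs)

for n in (1, 10, 100):
    print(n, sup_distance(n), 1.0 / (2 * n))
```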


Problem 4. Let A be a completely continuous operator mapping a 
reflexive Banach space E (e.g. a Hilbert space) into itself. Prove that A maps 
the closed unit sphere in E into a compact set. Reconcile this with the pre- 
ceding problem. 


Hint. Use Theorem 6, p. 205. 
Problem 5. Prove that


a) A linear combination of completely continuous operators is itself a
completely continuous operator;

b) The set $\mathscr{C}(E, E)$ of all completely continuous operators mapping a
Banach space $E$ into itself is a closed subspace of the linear space
$\mathscr{L}(E, E)$ of all bounded linear operators mapping $E$ into $E$.


Problem 6. Let $\mathscr{C}(E, E)$ and $\mathscr{L}(E, E)$ be the same as in the preceding
problem. Prove that besides being a linear space, $\mathscr{L}(E, E)$ is also a ring
when equipped with the usual operations of addition and multiplication of
operators. Prove that $\mathscr{C}(E, E)$ is a two-sided ideal in $\mathscr{L}(E, E)$.


Comment. By a two-sided ideal in a ring $\mathscr{R}$ is meant a subring $\mathscr{A} \subset \mathscr{R}$
such that $a \in \mathscr{A}$, $r \in \mathscr{R}$ implies $ar \in \mathscr{A}$, $ra \in \mathscr{A}$.


Problem 7. Let $\Phi$ and $A^*S^*$ be the same as in the proof of Theorem 4.
Show that $\Phi$ is closed and hence compact. Deduce from this that $A^*S^*$ is
compact, even though as shown in Problem 3, the image of the closed unit
sphere under a completely continuous operator need not be compact.


Problem 8. Discuss the connection between Theorem 4 and the theory of 
Sec. 20.4, in particular Corollary 1’, p. 204. 


Problem 9. Let A be a bounded linear operator mapping a Banach space 
E into itself. Show that if A* is completely continuous, then so is A. 


Problem 10. Prove that a linear operator A mapping a Hilbert space H 
into itself is completely continuous if and only if its adjoint (in the sense 
of Sec. 23.3) is completely continuous. 


Problem 11. Give an example of a completely continuous operator A 
mapping a Hilbert space H into itself, such that A has no eigenvectors. 
Reconcile this with Theorem 7. 


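Hint. Let $A$ be the operator in $l_2$ such that

$$Ax = A(x_1, x_2, \ldots, x_n, \ldots) = \left( 0, x_1, \frac{x_2}{2}, \ldots, \frac{x_{n-1}}{n-1}, \ldots \right).$$

Then $Ax = \lambda x$ implies

$$\lambda x_1 = 0, \quad \lambda x_2 = x_1, \quad \lambda x_3 = \frac{x_2}{2}, \quad \ldots, \quad \lambda x_n = \frac{x_{n-1}}{n-1}, \quad \ldots$$

and hence $x = 0$.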


Comment. This situation differs from the finite-dimensional case, where 
every linear operator (self-adjoint or not) has at least one eigenvector. 


7 


MEASURE 


The concept of the measure $\mu(E)$ of a set $E$ is a natural generalization of
such concepts as


1) The length $l(A)$ of a line segment $A$;

2) The area $A(F)$ of a plane figure $F$;

3) The volume $V(G)$ of a space figure $G$;

4) The increment $\varphi(b) - \varphi(a)$ of a nondecreasing function $\varphi(t)$ over a
half-open interval $[a, b)$;

5) The integral of a nonnegative function over a set on the line or over 
a region in the plane or in space. 


Although the notion of measure first arose in the theory of functions of a 
real variable, it was subsequently used extensively in functional analysis, 
probability theory, the theory of dynamical systems, and other branches 
of mathematics. In Sec. 25 we discuss the measure of plane sets, starting 
from the notion of the area of a rectangle. Measure in general will then 
be studied in Secs. 26 and 27. The reader will easily confirm that the con- 
siderations in Sec. 25 are of a general nature and carry over to the case of 
the more abstract theory without essential changes. 


25. Measure in the Plane 


25.1. Measure of elementary sets. Consider the system $\mathscr{S}$ of sets in the
xy-plane, each defined by one of the inequalities

$$a \le x \le b, \quad a < x \le b, \quad a \le x < b, \quad a < x < b$$

and one of the inequalities

$$c \le y \le d, \quad c < y \le d, \quad c \le y < d, \quad c < y < d,$$

where $a$, $b$, $c$ and $d$ are arbitrary real numbers. The sets in $\mathscr{S}$ will be called
rectangles. The closed rectangle defined by the inequalities

$$a \le x \le b, \quad c \le y \le d$$

is a rectangle in the usual sense (including its boundary) if $a < b$ and $c < d$,
a line segment (including its end points) if $a = b$ and $c < d$ or if $a < b$ and
$c = d$, a point if $a = b$, $c = d$, or even the empty set if $a > b$ or $c > d$. The
open rectangle

$$a < x < b, \quad c < y < d$$

is either a rectangle in the usual sense (without its boundary) if $a < b$ and
$c < d$ or the empty set if $a \ge b$ or $c \ge d$. Each of the rectangles of the
remaining types will be called half-open and is an ordinary rectangle minus
one, two or three sides, a line segment minus one or two end points, or
possibly the empty set.

In keeping with the concept of area familiar from elementary geometry,
we now define the measure of each set in $\mathscr{S}$ as follows:


1) The measure of the empty set equals 0;
2) The measure of the nonempty rectangle (closed, open or half-open)
specified by the numbers $a$, $b$, $c$, and $d$ equals

$$(b - a)(d - c).$$

Thus with each rectangle $P \in \mathscr{S}$ we associate a number $m(P)$, called its
measure, where clearly


1) $m(P)$ is real and nonnegative;
2) $m(P)$ is additive in the sense that if

$$P = \bigcup_{k=1}^{n} P_k, \qquad P_i \cap P_k = \varnothing \quad (i \ne k),$$

then

$$m(P) = \sum_{k=1}^{n} m(P_k).$$
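
The additivity of $m$ can be checked on a concrete decomposition. The sketch below makes simplifying assumptions that are not in the text: the class `Rect` and the helper `split` are hypothetical names, and only half-open rectangles $a \le x < b$, $c \le y < d$ are modeled, so that pieces obtained by cutting along grid lines are automatically pairwise disjoint. Exact rational arithmetic avoids rounding issues.

```python
from dataclasses import dataclass
from fractions import Fraction

@dataclass(frozen=True)
class Rect:
    # Half-open rectangle a <= x < b, c <= y < d (an assumed convention).
    a: Fraction
    b: Fraction
    c: Fraction
    d: Fraction

    def is_empty(self):
        return self.a >= self.b or self.c >= self.d

    def measure(self):
        # Measure 0 for the empty set, (b - a)(d - c) otherwise.
        if self.is_empty():
            return Fraction(0)
        return (self.b - self.a) * (self.d - self.c)

def split(P, x_cuts, y_cuts):
    # Cut P along vertical lines x = const and horizontal lines y = const,
    # producing pairwise disjoint half-open pieces whose union is P.
    xs = [P.a] + sorted(x for x in x_cuts if P.a < x < P.b) + [P.b]
    ys = [P.c] + sorted(y for y in y_cuts if P.c < y < P.d) + [P.d]
    return [Rect(x0, x1, y0, y1)
            for x0, x1 in zip(xs, xs[1:])
            for y0, y1 in zip(ys, ys[1:])]

P = Rect(Fraction(0), Fraction(3), Fraction(0), Fraction(2))
pieces = split(P, [Fraction(1), Fraction(5, 2)], [Fraction(1, 3)])
total = sum((p.measure() for p in pieces), Fraction(0))
print(len(pieces), total, P.measure())
```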


Our problem is to define the concept of measure for sets more general than 
rectangles, while preserving these two properties. The first step in this 
direction is to define measure for elementary sets, where by an elementary 
set we mean any set which can be represented in at least one way as a union 
of a finite number of pairwise disjoint rectangles. First we prove 


THEOREM 1. The union, intersection, difference and symmetric
difference of two elementary sets are again elementary sets.

Proof. If

$$A = \bigcup_k P_k, \qquad B = \bigcup_l Q_l$$

are two elementary sets, then clearly

$$A \cap B = \bigcup_{k,l} (P_k \cap Q_l)$$

is also an elementary set, since each $P_k \cap Q_l$ is obviously either a
rectangle or the empty set. Moreover, it is easy to see that the difference
of two rectangles is an elementary set. Hence, subtracting an elementary
set from a rectangle gives another elementary set (as an intersection of
elementary sets). Suppose $A$ and $B$ are elementary sets, and let $P$ be a
rectangle containing both of them (such a rectangle obviously exists).
It follows from what has just been proved that

$$A \cup B = P - [(P - A) \cap (P - B)]$$

is an elementary set. It is then an easy consequence of the formulas

$$A - B = A \cap (P - B),$$
$$A \,\Delta\, B = (A \cup B) - (A \cap B)$$

that the difference and symmetric difference of two elementary sets are
again elementary sets. ∎


Remark. In other words, the system of all elementary sets is a ring $\mathscr{R}$,
as defined on p. 31.
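
The step "the difference of two rectangles is an elementary set" can be made explicit by writing the difference as a union of at most four pairwise disjoint rectangles. The following sketch assumes, for simplicity, that every rectangle is half-open of the form $a \le x < b$, $c \le y < d$ and is stored as a tuple `(a, b, c, d)`; the function names are hypothetical, and the other boundary types allowed in the text are not handled.

```python
def is_empty(r):
    a, b, c, d = r
    return a >= b or c >= d

def measure(r):
    a, b, c, d = r
    return 0 if is_empty(r) else (b - a) * (d - c)

def intersect(p, q):
    # Intersection of two half-open rectangles (possibly empty).
    return (max(p[0], q[0]), min(p[1], q[1]),
            max(p[2], q[2]), min(p[3], q[3]))

def difference(p, q):
    """Pairwise disjoint half-open rectangles whose union is p - q."""
    if is_empty(p):
        return []
    i = intersect(p, q)
    if is_empty(i):
        return [p]
    a, b, c, d = p
    ia, ib, ic, id_ = i
    pieces = [
        (a, ia, c, d),     # full-height strip to the left of q
        (ib, b, c, d),     # full-height strip to the right of q
        (ia, ib, c, ic),   # block below q, between the two strips
        (ia, ib, id_, d),  # block above q, between the two strips
    ]
    return [r for r in pieces if not is_empty(r)]

p = (0, 4, 0, 3)
q = (1, 2, 1, 5)
parts = difference(p, q)
print(parts, sum(measure(r) for r in parts))
```

In this example the pieces have total measure $m(P) - m(P \cap Q) = 12 - 2 = 10$, in agreement with the additivity of $m$.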


We now define measure for elementary sets: 


DEFINITION 1. Given an elementary set $A$, suppose

$$A = \bigcup_k P_k,$$

where the $P_k$ are pairwise disjoint rectangles. Then by the measure of $A$,
denoted by $m(A)$, is meant the number

$$m(A) = \sum_k m(P_k), \qquad (1)$$

where $m(P_k)$ is the measure of the rectangle $P_k$.

Remark. Clearly, $m(A)$ is nonnegative and additive. Moreover, in defining
$m(A)$, we have tacitly relied on the fact that the sum (1) does not depend on
how $A$ is represented as a union of rectangles. To verify this, suppose

$$A = \bigcup_k P_k = \bigcup_l Q_l,$$

where $P_k$ and $Q_l$ are rectangles such that

$$P_i \cap P_j = \varnothing, \qquad Q_i \cap Q_j = \varnothing \qquad (i \ne j).$$

Since the intersection $P_k \cap Q_l$ of two rectangles is itself a rectangle, it follows
from the additivity of the measure of rectangles that

$$\sum_k m(P_k) = \sum_{k,l} m(P_k \cap Q_l) = \sum_l m(Q_l).$$
k kyl l 


THEOREM 2. If $A$ is an elementary set and $\{A_n\}$ is a finite or countable
system of elementary sets such that

$$A \subset \bigcup_n A_n,$$

then

$$m(A) \le \sum_n m(A_n). \qquad (2)$$


Proof. Given any $\varepsilon > 0$, there is a closed elementary set $\bar{A}$ contained
in $A$ and satisfying the condition

$$m(\bar{A}) \ge m(A) - \frac{\varepsilon}{2}.$$

In fact, to get $\bar{A}$ we need only replace each of the $k$ rectangles $P_i$ making
up $A$ by a closed rectangle contained in $P_i$ of area no less than

$$m(P_i) - \frac{\varepsilon}{2k}.$$

Moreover, for each $A_n$ there is clearly an open elementary set $\tilde{A}_n$ containing
$A_n$ and satisfying the condition

$$m(\tilde{A}_n) \le m(A_n) + \frac{\varepsilon}{2^{n+1}}.$$

Obviously,

$$\bar{A} \subset \bigcup_n \tilde{A}_n.$$

Hence, by the Heine-Borel theorem (recall p. 92), there is a finite
system $\tilde{A}_{n_1}, \ldots, \tilde{A}_{n_s}$ covering $\bar{A}$, where

$$m(\bar{A}) \le \sum_{i=1}^{s} m(\tilde{A}_{n_i}),$$

since otherwise $\bar{A}$ would be covered by a finite number of rectangles of
total area less than $m(\bar{A})$, which is impossible. Therefore

$$m(A) \le m(\bar{A}) + \frac{\varepsilon}{2} \le \sum_{i=1}^{s} m(\tilde{A}_{n_i}) + \frac{\varepsilon}{2} \le \sum_n m(\tilde{A}_n) + \frac{\varepsilon}{2} \le \sum_n m(A_n) + \sum_n \frac{\varepsilon}{2^{n+1}} + \frac{\varepsilon}{2} = \sum_n m(A_n) + \varepsilon,$$

which implies (2), since $\varepsilon > 0$ is arbitrary. ∎


25.2. Lebesgue measure of plane sets. Elementary sets are, of course, far
from being the most general plane sets considered in geometry and analysis.
Thus we naturally arrive at the problem of extending the concept of measure
(while preserving its basic properties) to sets more general than finite unions
of rectangles with sides parallel to the coordinate axes. This problem is
solved in a definitive way by Lebesgue's theory of measure, in which we
consider countably infinite unions of rectangles, as well as finite unions.
To avoid sets of "infinite measure," we restrict our discussion to subsets
of the closed unit square $E$, defined by the inequalities

$$0 \le x \le 1, \qquad 0 \le y \le 1$$

(this restriction is dropped in Remarks 2 and 3, p. 267).


DEFINITION 2. By the outer measure of a set $A \subset E$ is meant the
number

$$\mu^*(A) = \inf_{A \subset \bigcup_k P_k} \sum_k m(P_k),$$

where the greatest lower bound is taken over all coverings of $A$ by a finite
or countable system of rectangles $P_k$.


DEFINITION 3. By the inner measure of a set $A \subset E$ is meant the
number

$$\mu_*(A) = 1 - \mu^*(E - A).$$


THEOREM 3. The inequality

$$\mu_*(A) \le \mu^*(A)$$

holds for any set $A \subset E$.


Proof. Suppose

$$\mu_*(A) > \mu^*(A),$$

i.e.,

$$\mu^*(A) + \mu^*(E - A) < 1.$$

Then, by the definition of a greatest lower bound, there are systems of
rectangles $\{P_j\}$ and $\{Q_k\}$ covering $A$ and $E - A$, respectively, such that

$$\sum_j m(P_j) + \sum_k m(Q_k) < 1.$$

Let $\{R_l\}$ denote the union of the systems $\{P_j\}$ and $\{Q_k\}$. Then

$$E \subset \bigcup_l R_l,$$

while

$$m(E) > \sum_l m(R_l),$$

contrary to Theorem 2. ∎


DEFINITION 4. A set $A$ is said to be (Lebesgue) measurable if

$$\mu_*(A) = \mu^*(A),$$

i.e., if its inner and outer measures coincide.


DEFINITION 5. If a set $A$ is measurable, the number $\mu(A)$ equal to the
common value of $\mu_*(A)$ and $\mu^*(A)$ is called the (Lebesgue) measure of $A$.


For outer measure, we have the following analogue of Theorem 2: 


THEOREM 4. If $A$ is any set and $\{A_n\}$ is a finite or countable system of
sets such that

$$A \subset \bigcup_n A_n,$$

then

$$\mu^*(A) \le \sum_n \mu^*(A_n). \qquad (2')$$


Proof. Given any $\varepsilon > 0$, for each $A_n$ there is a finite or countable
system of rectangles $\{P_{nk}\}$ such that

$$A_n \subset \bigcup_k P_{nk}$$

and

$$\sum_k m(P_{nk}) \le \mu^*(A_n) + \frac{\varepsilon}{2^n},$$

by the definition of outer measure. Then

$$A \subset \bigcup_n \bigcup_k P_{nk},$$

and

$$\mu^*(A) \le \sum_n \sum_k m(P_{nk}) \le \sum_n \mu^*(A_n) + \varepsilon,$$

which implies (2′), since $\varepsilon > 0$ is arbitrary. ∎


COROLLARY. If $A$ is any measurable set and $\{A_n\}$ is a finite or countable
system of measurable sets such that

$$A \subset \bigcup_n A_n,$$

then

$$\mu(A) \le \sum_n \mu(A_n). \qquad (2'')$$

Proof. Merely replace $\mu^*$ by $\mu$ in (2′). ∎


Next we show that the Lebesgue measure of an elementary set coincides 
with its measure as previously defined: 


THEOREM 5. Every elementary set $A \subset E$ is measurable, with Lebesgue
measure $\mu(A)$ equal to the measure $m(A)$ introduced in Definition 1.

Proof. Suppose $A$ is the union of the pairwise disjoint rectangles
$P_1, \ldots, P_k$. Then

$$m(A) = \sum_{j=1}^{k} m(P_j),$$

by Definition 1. Therefore, since the rectangles $P_1, \ldots, P_k$ obviously
cover $A$,

$$\mu^*(A) \le \sum_j m(P_j) = m(A), \qquad (3)$$

by Definition 2. Moreover, if $\{Q_j\}$ is any finite or countable system of
rectangles covering $A$, we have

$$m(A) \le \sum_j m(Q_j)$$

by Theorem 2, and hence

$$m(A) \le \mu^*(A), \qquad (4)$$

by Definition 2 again. Comparing (3) and (4), we get

$$m(A) = \mu^*(A).$$

Now $E - A$ is also an elementary set, and hence

$$m(E - A) = \mu^*(E - A).$$

But

$$m(E - A) = 1 - m(A),$$

while

$$\mu^*(E - A) = 1 - \mu_*(A).$$

It follows that

$$m(A) = \mu_*(A),$$

and hence

$$m(A) = \mu_*(A) = \mu^*(A). \quad ∎$$


COROLLARY. Theorem 2 is a special case of Theorem 4.
Proof. Merely replace $\mu^*$ by $m$ in (2′) or $\mu$ by $m$ in (2″). ∎
LEMMA. The inequality

$$|\mu^*(A) - \mu^*(B)| \le \mu^*(A \,\Delta\, B) \qquad (5)$$

holds for any two sets $A$ and $B$.


Proof. Since

$$A \subset B \cup (A \,\Delta\, B),$$

it follows from Theorem 4 that

$$\mu^*(A) \le \mu^*(B) + \mu^*(A \,\Delta\, B). \qquad (6)$$

This implies (5) if $\mu^*(A) \ge \mu^*(B)$. If $\mu^*(A) \le \mu^*(B)$, we deduce (5)
from the inequality

$$\mu^*(B) \le \mu^*(A) + \mu^*(A \,\Delta\, B)$$

obtained by interchanging the roles of $A$ and $B$ in (6). ∎


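THEOREM 6. A set $A$ is measurable if and only if, given any $\varepsilon > 0$,
there is an elementary set $B$ such that

$$\mu^*(A \,\Delta\, B) < \varepsilon. \qquad (7)$$


Proof. Suppose that given any $\varepsilon > 0$, there is an elementary set $B$
such that (7) holds. Then, by the lemma,

$$|\mu^*(A) - \mu^*(B)| = |\mu^*(A) - m(B)| < \varepsilon, \qquad (8)$$

and similarly

$$|\mu^*(E - A) - m(E - B)| < \varepsilon, \qquad (9)$$

since

$$(E - A) \,\Delta\, (E - B) = A \,\Delta\, B.$$

Bearing in mind that

$$m(B) + m(E - B) = m(E) = 1,$$

we deduce from (8) and (9) that

$$|\mu^*(A) + \mu^*(E - A) - 1| < 2\varepsilon,$$

and hence that

$$\mu^*(A) + \mu^*(E - A) = 1, \qquad (10)$$

since $\varepsilon > 0$ is arbitrary. But then $\mu_*(A) = \mu^*(A)$, so that $A$ is
measurable.

Conversely, suppose $A$ is measurable, i.e., suppose (10) holds. Then,
given any $\varepsilon > 0$, there are systems of rectangles $\{B_n\}$ and $\{C_n\}$ covering
$A$ and $E - A$, respectively, such that

$$\sum_n m(B_n) \le \mu^*(A) + \frac{\varepsilon}{3}, \qquad (11)$$

$$\sum_n m(C_n) \le \mu^*(E - A) + \frac{\varepsilon}{3}. \qquad (12)$$

Moreover, since $\sum_n m(B_n) < \infty$, there is an $N$ such that

$$\sum_{n > N} m(B_n) < \frac{\varepsilon}{3}.$$

We now show that (7) holds for the elementary set

$$B = \bigcup_{n=1}^{N} B_n.$$

Clearly, the set

$$P = \bigcup_{n > N} B_n$$

contains $A - B$, while the set

$$Q = \bigcup_n (B \cap C_n)$$

contains $B - A$, and hence

$$A \,\Delta\, B \subset P \cup Q. \qquad (13)$$

Moreover,

$$\mu^*(P) \le \sum_{n > N} m(B_n) < \frac{\varepsilon}{3}. \qquad (14)$$

To estimate $\mu^*(Q)$, we note that

$$\left( \bigcup_n B_n \right) \cup \left( \bigcup_n (C_n - B) \right) \supset E,$$

and hence

$$\sum_n m(B_n) + \sum_n m(C_n - B) \ge 1. \qquad (15)$$

But (11) and (12) imply

$$\sum_n m(B_n) + \sum_n m(C_n) \le \mu^*(A) + \mu^*(E - A) + \frac{2\varepsilon}{3} = 1 + \frac{2\varepsilon}{3}. \qquad (16)$$

Subtracting (15) from (16), we get

$$\sum_n m(C_n \cap B) \le \frac{2\varepsilon}{3},$$

i.e.,

$$\mu^*(Q) \le \frac{2\varepsilon}{3}. \qquad (17)$$

Finally, comparing (13), (14) and (17), we find that

$$\mu^*(A \,\Delta\, B) \le \mu^*(P \cup Q) \le \mu^*(P) + \mu^*(Q) < \varepsilon. \quad ∎$$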


THEOREM 7. The union and intersection of a finite number of measurable 
sets are again measurable sets. 


Proof. It is enough to prove the theorem for two sets. Thus suppose
$A_1$ and $A_2$ are measurable sets. Then, by Theorem 6, given any $\varepsilon > 0$,
there are elementary sets $B_1$ and $B_2$ such that

$$\mu^*(A_1 \,\Delta\, B_1) < \frac{\varepsilon}{2}, \qquad \mu^*(A_2 \,\Delta\, B_2) < \frac{\varepsilon}{2}.$$

Since

$$(A_1 \cup A_2) \,\Delta\, (B_1 \cup B_2) \subset (A_1 \,\Delta\, B_1) \cup (A_2 \,\Delta\, B_2),$$

we have

$$\mu^*[(A_1 \cup A_2) \,\Delta\, (B_1 \cup B_2)] \le \mu^*(A_1 \,\Delta\, B_1) + \mu^*(A_2 \,\Delta\, B_2) < \varepsilon.$$

But $B_1 \cup B_2$ is an elementary set, and hence $A_1 \cup A_2$ is measurable, by
Theorem 6 again. Moreover, a set $A$ is measurable if and only if

$$\mu^*(A) + \mu^*(E - A) = 1,$$

and hence if $A$ is measurable, so is $E - A$. Therefore the measurability
of $A_1 \cap A_2$ follows from that of unions and complements and the formula

$$A_1 \cap A_2 = E - [(E - A_1) \cup (E - A_2)]. \quad ∎$$


COROLLARY. The difference and symmetric difference of two measurable
sets are again measurable sets.
Proof. An immediate consequence of Theorem 7 and the formulas

$$A_1 - A_2 = A_1 \cap (E - A_2),$$
$$A_1 \,\Delta\, A_2 = (A_1 - A_2) \cup (A_2 - A_1). \quad ∎$$
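
The set-theoretic inclusion and identities used in the proof of Theorem 7 and its corollary hold for arbitrary sets, so they can be spot-checked on random finite sets. The sketch below is only a sanity check, not a proof; the finite universe `E` (standing in for the unit square) and the helper names are assumptions made for illustration.

```python
import random

E = set(range(20))  # finite stand-in for the unit square (an assumption)

def random_subset(rng):
    return {x for x in E if rng.random() < 0.5}

def check_identities(trials=1000, seed=0):
    rng = random.Random(seed)
    for _ in range(trials):
        A1, A2, B1, B2 = (random_subset(rng) for _ in range(4))
        # Inclusion used to estimate the outer measure of the symmetric
        # difference of the unions (^ is symmetric difference, <= is subset).
        assert ((A1 | A2) ^ (B1 | B2)) <= ((A1 ^ B1) | (A2 ^ B2))
        # Intersection expressed through unions and complements.
        assert (A1 & A2) == E - ((E - A1) | (E - A2))
        # Formulas from the corollary.
        assert (A1 - A2) == (A1 & (E - A2))
        assert (A1 ^ A2) == ((A1 - A2) | (A2 - A1))
    return True

print(check_identities())
```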
THEOREM 8. If $A_1, \ldots, A_n$ are pairwise disjoint measurable sets, then

$$\mu\left( \bigcup_{k=1}^{n} A_k \right) = \sum_{k=1}^{n} \mu(A_k).$$


Proof. As in the proof of Theorem 7, we need only consider the case
$n = 2$. By Theorem 6, given any $\varepsilon > 0$, there are elementary sets $B_1$
and $B_2$ such that

$$\mu^*(A_1 \,\Delta\, B_1) < \varepsilon, \qquad \mu^*(A_2 \,\Delta\, B_2) < \varepsilon. \qquad (18)$$

Let

$$A = A_1 \cup A_2, \qquad B = B_1 \cup B_2.$$

Then $A$ is measurable, by Theorem 7. Since $A_1$ and $A_2$ are disjoint, we
have

$$B_1 \cap B_2 \subset (A_1 \,\Delta\, B_1) \cup (A_2 \,\Delta\, B_2),$$

and hence

$$m(B_1 \cap B_2) \le 2\varepsilon. \qquad (19)$$

Moreover, it follows from (18) and the lemma on p. 260 that

$$|m(B_1) - \mu^*(A_1)| < \varepsilon, \qquad |m(B_2) - \mu^*(A_2)| < \varepsilon. \qquad (20)$$

Since measure is additive on elementary sets, it follows from (19) and
(20) that

$$m(B) = m(B_1) + m(B_2) - m(B_1 \cap B_2) \ge \mu^*(A_1) + \mu^*(A_2) - 4\varepsilon.$$

Noting also that

$$A \,\Delta\, B \subset (A_1 \,\Delta\, B_1) \cup (A_2 \,\Delta\, B_2),$$

we have

$$\mu^*(A) \ge m(B) - \mu^*(A \,\Delta\, B) \ge m(B) - 2\varepsilon \ge \mu^*(A_1) + \mu^*(A_2) - 6\varepsilon.$$

Therefore

$$\mu^*(A) \ge \mu^*(A_1) + \mu^*(A_2), \qquad (21)$$

since $\varepsilon > 0$ can be made arbitrarily small. On the other hand, it follows
from $A = A_1 \cup A_2$ and Theorem 4 that

$$\mu^*(A) \le \mu^*(A_1) + \mu^*(A_2). \qquad (22)$$

Comparing (21) and (22), we get

$$\mu^*(A) = \mu^*(A_1) + \mu^*(A_2),$$

where $\mu^*$ can be replaced by $\mu$, since $A_1$, $A_2$, and $A$ are measurable. ∎


THEOREM 9. The union and intersection of a countable number of 
measurable sets are again measurable sets. 


Proof. Given a countable system of measurable sets $\{A_n\}$, let

$$A = \bigcup_{n=1}^{\infty} A_n,$$

and let

$$A_1' = A_1, \qquad A_n' = A_n - \bigcup_{k=1}^{n-1} A_k \qquad (n = 2, 3, \ldots).$$

Then the sets $A_n'$ are pairwise disjoint, and

$$A = \bigcup_{n=1}^{\infty} A_n'.$$

By Theorem 7 and its corollary, the sets $A_n'$ are all measurable. Moreover,
by Theorems 4 and 8,

$$\sum_{n=1}^{N} \mu(A_n') = \mu\left( \bigcup_{n=1}^{N} A_n' \right) \le \mu^*(A)$$

for every $N = 1, 2, \ldots$. Therefore the series

$$\sum_{n=1}^{\infty} \mu(A_n')$$

converges, and hence, given any $\varepsilon > 0$, there is an integer $\nu > 0$ such that

$$\sum_{n > \nu} \mu(A_n') < \frac{\varepsilon}{2}. \qquad (23)$$

Since the set

$$C = \bigcup_{n=1}^{\nu} A_n'$$

is measurable, being the union of a finite number of measurable sets,
there is an elementary set $B$ such that

$$\mu^*(C \,\Delta\, B) < \frac{\varepsilon}{2}. \qquad (24)$$

Moreover, since

$$A \,\Delta\, B \subset (C \,\Delta\, B) \cup \left( \bigcup_{n > \nu} A_n' \right),$$

it follows from (23) and (24) that

$$\mu^*(A \,\Delta\, B) < \varepsilon.$$

Therefore $A$ is measurable, by Theorem 6. Finally, since complements of
measurable sets are themselves measurable, the intersection

$$\bigcap_n A_n = E - \bigcup_n (E - A_n)$$

is measurable. ∎


Theorem 9 generalizes Theorem 7 to the case of a countable number of 
measurable sets. The corresponding generalization of Theorem 8 is given by 


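THEOREM 10. If $A_1, A_2, \ldots, A_n, \ldots$ are pairwise disjoint measurable
sets, then

$$\mu\left( \bigcup_{n=1}^{\infty} A_n \right) = \sum_{n=1}^{\infty} \mu(A_n). \qquad (25)$$

Proof. Let

$$A = \bigcup_{n=1}^{\infty} A_n.$$

Then, since

$$\bigcup_{n=1}^{N} A_n \subset A$$

for every $N = 1, 2, \ldots$, it follows from Theorem 8 and the corollary to
Theorem 4 that

$$\sum_{n=1}^{N} \mu(A_n) = \mu\left( \bigcup_{n=1}^{N} A_n \right) \le \mu(A).$$

Taking the limit as $N \to \infty$, we get

$$\sum_{n=1}^{\infty} \mu(A_n) \le \mu(A). \qquad (26)$$

On the other hand, since obviously

$$A \subset \bigcup_{n=1}^{\infty} A_n,$$

it follows from the same corollary that

$$\mu(A) \le \sum_{n=1}^{\infty} \mu(A_n). \qquad (27)$$

Comparing (26) and (27), we get

$$\mu(A) = \sum_{n=1}^{\infty} \mu(A_n),$$

or equivalently (25). ∎


The key property of the measure $\mu$ expressed by (25) is described by
saying that $\mu$ is countably additive or $\sigma$-additive.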


THEOREM 11. Let $\{A_n\}$ be a sequence of measurable sets which is
decreasing in the sense that

$$A_1 \supset A_2 \supset \cdots \supset A_n \supset \cdots.$$

Then

$$\lim_{n \to \infty} \mu(A_n) = \mu(A), \qquad (28)$$

where

$$A = \bigcap_{n=1}^{\infty} A_n.$$


Proof. We need only consider the case $A = \varnothing$, to which the general
case reduces if $A_n$ is replaced by $A_n - A$. Clearly

$$A_1 = (A_1 - A_2) \cup (A_2 - A_3) \cup \cdots,$$

and

$$A_n = (A_n - A_{n+1}) \cup (A_{n+1} - A_{n+2}) \cup \cdots.$$

Therefore, by the $\sigma$-additivity of $\mu$,

$$\mu(A_1) = \sum_{k=1}^{\infty} \mu(A_k - A_{k+1}) \qquad (29)$$

and

$$\mu(A_n) = \sum_{k=n}^{\infty} \mu(A_k - A_{k+1}). \qquad (30)$$

Since the series (29) converges, its remainder (30) approaches 0 as $n \to \infty$.
It follows that

$$\lim_{n \to \infty} \mu(A_n) = 0 = \mu(\varnothing). \quad ∎$$


COROLLARY. Let $\{A_n\}$ be a sequence of measurable sets which is
increasing in the sense that

$$A_1 \subset A_2 \subset \cdots \subset A_n \subset \cdots.$$

Then

$$\lim_{n \to \infty} \mu(A_n) = \mu(A), \qquad (28')$$

where

$$A = \bigcup_{n=1}^{\infty} A_n.$$


Proof. Apply Theorem 11 to the complements of the sets $A_n$. ∎


The property of the measure $\mu$ expressed by (28) and (28′) is described
by saying that $\mu$ is continuous.
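
As a concrete instance of continuity, one may take the decreasing sequence of closed rectangles $A_n = \{(x, y): 0 \le x \le 1/n,\ 0 \le y \le 1\}$, whose intersection is the segment $x = 0$, $0 \le y \le 1$ of measure zero. The sketch below is an assumed example, not one from the text; it computes $\mu(A_n) = 1/n$ exactly and checks that the partial sums of the telescoping series in (29) leave exactly the remainder $\mu(A_{N+1})$, as in (30).

```python
from fractions import Fraction

def mu_A(n):
    # Measure (b - a)(d - c) of the closed rectangle [0, 1/n] x [0, 1].
    return (Fraction(1, n) - 0) * (1 - 0)

def mu_ring(k):
    # Measure of A_k - A_{k+1}, by additivity on elementary sets.
    return mu_A(k) - mu_A(k + 1)

def partial_sum(N):
    # Partial sum of the series (29) over k = 1, ..., N.
    return sum((mu_ring(k) for k in range(1, N + 1)), Fraction(0))

for N in (1, 10, 100):
    # mu(A_1) minus the partial sum is the remainder mu(A_{N+1}) = 1/(N+1).
    print(N, mu_A(1) - partial_sum(N), mu_A(N + 1))
```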


Remark 1. To recapitulate, starting from a measure $m$ defined on the
class $\mathscr{S}$ of all rectangles (with sides parallel to the coordinate axes), we
have succeeded in extending $m$ first to a measure $m$ defined on the larger
class $\mathscr{R}$ of all elementary sets and then to a Lebesgue measure $\mu$ defined
on the still larger class $\mathscr{S}_\mu$ of all measurable sets. The class $\mathscr{S}_\mu$ is closed
under the operations of taking countable unions and intersections. Moreover,
the measure $\mu$ is $\sigma$-additive on $\mathscr{S}_\mu$.


Remark 2. So far we have required all our sets to be subsets of the closed
unit square

$$E = \{(x, y): 0 \le x \le 1,\ 0 \le y \le 1\}.$$

It is easy to get rid of this restriction. For example, representing the whole
plane as the union of the squares

$$E_{mn} = \{(x, y): m < x \le m + 1,\ n < y \le n + 1\},$$

where $m$ and $n$ are arbitrary integers, we say that a plane set $A$ is measurable
if its intersection $A_{mn} = A \cap E_{mn}$ with every square $E_{mn}$ is measurable as
previously defined and if the series

$$\sum_{m,n} \mu(A_{mn})$$

converges. The measure of $A$ is then defined as

$$\mu(A) = \sum_{m,n} \mu(A_{mn}). \qquad (31)$$

All the properties of measure proved above carry over to this more general
case in a straightforward way (give the details).


Remark 3. We might go still further, calling a set $A$ measurable with
"infinite measure" if every $A_{mn}$ is measurable and if the series (31) diverges.
Alternatively, we can regard the whole plane as the union of the squares

$$E_n = \{(x, y): -n \le x \le n,\ -n \le y \le n\},$$

calling a plane set measurable, with (possibly infinite) measure

$$\mu(A) = \lim_{n \to \infty} \mu(A_n) \qquad (32)$$

if its intersection $A_n = A \cap E_n$ with every square $E_n$ is measurable as
previously defined. As an exercise, prove the consistency of (31) and (32).


Problem 1. Let $E$ be the closed unit square. Prove that


a) Every open subset of E is measurable; 

b) Every closed subset of E is measurable; 

c) Every set obtained from open and closed subsets of E by forming no 
more than a countable number of unions, intersections and com- 
plements is measurable. 


Comment. There are measurable subsets of E which are not of the type c). 


Problem 2. Construct a theory of Lebesgue measure for sets on the line, 
starting from intervals (closed, open and half-open) instead of rectangles. 
Do the same for 


a) Sets on the circumference of a circle; 
b) Three-dimensional sets; 
c) Sets in $R^n$.


Problem 3. Prove that the set of all rational points on the line is
measurable, with measure zero.


Problem 4. Prove that the Cantor set constructed in Example 4, p. 52 
is measurable, with measure zero. 


Problem 5. Prove that every set of positive measure in the interval [0, 1] 
contains a pair of points whose distance apart is a rational number. 


Problem 6. Show that the power of the set of all measurable subsets of 
the interval [0, 1] is greater than the power of the continuum. 


Problem 7. Let $C$ be a circle of circumference 1, and let $\alpha$ be an irrational
number. Let all points of $C$ which can be obtained from each other by
rotating $C$ through an angle $n\alpha\pi$ (where $n$ is any integer, positive, negative
or zero) be assigned to the same class. (Clearly, each such class contains
countably many points.) Let $\Phi_0$ be any set containing one point from each
class. Prove that $\Phi_0$ is nonmeasurable.




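Hint. Let $\Phi_n$ be the set obtained by rotating $\Phi_0$ through the angle $n\alpha\pi$.
Then

$$C = \bigcup_{n=-\infty}^{\infty} \Phi_n$$

and

$$\Phi_m \cap \Phi_n = \emptyset \qquad (m \ne n).$$

If $\Phi_0$ were measurable, the congruent sets $\Phi_n$ would also be measurable.
This would imply

$$\sum_{n=-\infty}^{\infty} \mu(\Phi_n) = 1, \tag{33}$$

by the $\sigma$-additivity of $\mu$. But congruent sets must have the same measure,
i.e., if $\Phi_0$ were measurable, then

$$\mu(\Phi_n) = \mu(\Phi_0),$$

so that the series in (33) would have all its terms equal and its sum would be
either 0 or infinite, which contradicts (33).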


26. General Measure Theory 


26.1. Measure on a semiring. In Sec. 25 we constructed a theory of
measure of plane sets, starting from a measure (area) $m$ defined on the class
$\mathcal{S}_m$ of all rectangles (with sides parallel to the coordinate axes) and then
extending $m$ to a Lebesgue measure $\mu$, defined on the much larger class $\mathcal{S}_\mu$
of all measurable sets. The explicit formula for the area of a rectangle played
no role in this construction. In fact, a moment's thought shows that we only
used the following properties of the set function $m$:


1) The domain of definition $\mathcal{S}_m$ of $m$, i.e., the class of all rectangles,
is a semiring;¹

2) $m$ is real and nonnegative;

3) $m$ is additive in the sense that if $P$ is a rectangle such that

$$P = \bigcup_{k=1}^{n} P_k,$$

where $P_1, \ldots, P_n$ are pairwise disjoint rectangles, then

$$m(P) = \sum_{k=1}^{n} m(P_k).$$


As will be shown in this section and the next, the construction given in 
Sec. 25 for the case of plane sets can be carried out in an abstract setting, 
whose very generality greatly enhances its range of applicability. 


1 We now draw freely from the material in Sec. 4, on systems of sets.

Guided by the above properties of $m$, we introduce

DEFINITION 1. A set function $\mu(A)$ is called a measure if

1) The domain of definition $\mathcal{S}_\mu$ of $\mu$ is a semiring;
2) $\mu$ is real and nonnegative;
3) $\mu$ is additive in the sense that if $A$ is a set in $\mathcal{S}_\mu$ such that

$$A = \bigcup_{k=1}^{n} A_k,$$

where $A_1, \ldots, A_n$ are pairwise disjoint sets in $\mathcal{S}_\mu$, then

$$\mu(A) = \sum_{k=1}^{n} \mu(A_k).$$

Remark. It follows from $\emptyset = \emptyset \cup \emptyset$ that

$$\mu(\emptyset) = 2\mu(\emptyset),$$

and hence

$$\mu(\emptyset) = 0.$$


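THEOREM 1. Let $\mu$ be a measure on a semiring $\mathcal{S}_\mu$, and suppose the
sets $A, A_1, \ldots, A_n$, where $A_1, \ldots, A_n$ are pairwise disjoint subsets of $A$,
all belong to $\mathcal{S}_\mu$. Then

$$\sum_{k=1}^{n} \mu(A_k) \le \mu(A).$$

Proof. By Lemma 1, p. 33, there is a finite expansion

$$A = \bigcup_{k=1}^{s} A_k \qquad (s \ge n)$$

with $A_1, \ldots, A_n$ as its first $n$ terms, where

$$A_k \in \mathcal{S}_\mu, \qquad A_k \cap A_l = \emptyset \quad (k \ne l)$$

for all $k, l = 1, \ldots, s$. Hence

$$\sum_{k=1}^{n} \mu(A_k) \le \sum_{k=1}^{s} \mu(A_k) = \mu(A),$$

since $\mu$ is nonnegative and additive. ∎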


THEOREM 2. Let $\mu$ be a measure on a semiring $\mathcal{S}_\mu$, and suppose the
sets $A, A_1, \ldots, A_n$ all belong to $\mathcal{S}_\mu$ and satisfy the condition

$$A \subset \bigcup_{k=1}^{n} A_k.$$

Then

$$\mu(A) \le \sum_{k=1}^{n} \mu(A_k).$$

Proof. According to Lemma 2, p. 33, there is a finite system of
pairwise disjoint sets $B_1, \ldots, B_t$ belonging to $\mathcal{S}_\mu$ such that each of the
sets $A, A_1, \ldots, A_n$ has a finite expansion

$$A = \bigcup_{s \in M_0} B_s, \qquad A_k = \bigcup_{s \in M_k} B_s \quad (k = 1, \ldots, n)$$

with respect to certain of the sets $B_s$, where each index $s \in M_0$ belongs to
at least one of the sets $M_k$ (recall footnote 16, p. 33). Hence each term
in the sum

$$\sum_{s \in M_0} \mu(B_s)$$

appears at least once in the double sum

$$\sum_{k=1}^{n} \sum_{s \in M_k} \mu(B_s).$$

It follows that

$$\mu(A) = \sum_{s \in M_0} \mu(B_s) \le \sum_{k=1}^{n} \sum_{s \in M_k} \mu(B_s) = \sum_{k=1}^{n} \mu(A_k).$$ ∎

COROLLARY. If $A \subset A'$, then $\mu(A) \le \mu(A')$.

Proof. Choose $n = 1$. ∎
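
To make Theorems 1 and 2 concrete, here is a minimal sketch on one familiar
semiring: the half-open intervals $[a, b)$ of the line, with length as the
measure. This particular semiring, and the helper names `length`,
`pairwise_disjoint` and `covers`, are assumptions chosen for illustration and
are not part of the abstract construction.

```python
from fractions import Fraction as F


def length(interval):
    """Measure of the half-open interval [a, b)."""
    a, b = interval
    return max(F(0), b - a)


def pairwise_disjoint(intervals):
    """True if the nonempty half-open intervals do not overlap."""
    s = sorted(intervals)
    return all(s[i][1] <= s[i + 1][0] for i in range(len(s) - 1))


def covers(A, intervals):
    """True if [a, b) is contained in the union of the given intervals."""
    a, b = A
    reach = a
    for lo, hi in sorted(intervals):
        if lo > reach:      # a gap starts at `reach`
            break
        reach = max(reach, hi)
    return reach >= b


A = (F(0), F(1))
inner = [(F(0), F(1, 4)), (F(1, 2), F(3, 4))]   # disjoint subsets of A
cover = [(F(0), F(1, 2)), (F(1, 3), F(1))]      # overlapping cover of A

# Theorem 1: disjoint subsets have total measure at most mu(A).
print(pairwise_disjoint(inner), sum(map(length, inner)) <= length(A))
# Theorem 2: a finite cover has total measure at least mu(A).
print(covers(A, cover), length(A) <= sum(map(length, cover)))
```

The two inequalities are strict here because `inner` leaves part of $A$
uncovered and the sets in `cover` overlap.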


It will be recalled that the first step in constructing Lebesgue measure of 
plane sets was to extend measure from rectangles to elementary sets, i.e., to 
finite unions of disjoint rectangles. We now consider the abstract analogue 
of this process: 


DEFINITION 2. A measure $\mu$ is called an extension of a measure $m$ if
$\mathcal{S}_m \subset \mathcal{S}_\mu$ and $\mu(A) = m(A)$ for every $A \in \mathcal{S}_m$.


THEOREM 3. Any measure $m$ defined on a semiring $\mathcal{S}_m$ has a unique
extension $\mu$ defined on the ring $\mathcal{R}(\mathcal{S}_m)$, i.e., the minimal ring generated
by $\mathcal{S}_m$.

Proof. By Theorem 3, p. 34, every set $A \in \mathcal{R}(\mathcal{S}_m)$ has a finite
expansion

$$A = \bigcup_{k=1}^{n} B_k, \tag{1}$$

where the sets $B_1, \ldots, B_n$ are pairwise disjoint and belong to $\mathcal{S}_m$. Let

$$\mu(A) = \sum_{k=1}^{n} m(B_k). \tag{2}$$

Then $\mu$ is obviously real, nonnegative and additive. Moreover, the
quantity $\mu(A)$ defined by (2) is independent of the expansion (1). In fact,
suppose $A$ has another expansion of the form

$$A = \bigcup_{l=1}^{r} C_l, \tag{1'}$$

where the sets $C_1, \ldots, C_r$ are pairwise disjoint and belong to $\mathcal{S}_m$. Then,
since the intersections $B_k \cap C_l$ all belong to $\mathcal{S}_m$, it follows from the
additivity of the measure $m$ that

$$\sum_{k=1}^{n} m(B_k) = \sum_{k=1}^{n} \sum_{l=1}^{r} m(B_k \cap C_l) = \sum_{l=1}^{r} m(C_l),$$

and hence

$$\sum_{l=1}^{r} m(C_l) = \mu(A),$$

as asserted. This proves the existence of the extension $\mu$. To prove the
uniqueness of $\mu$, suppose $m$ has another extension $\mu'$, and let $A$ be the
set (1). Then, by the additivity of $\mu'$,

$$\mu'(A) = \sum_{k=1}^{n} \mu'(B_k) = \sum_{k=1}^{n} m(B_k) = \mu(A).$$

Hence, since every set $A \in \mathcal{R}(\mathcal{S}_m)$ has a representation of the form (1),
the extensions $\mu$ and $\mu'$ coincide. ∎


Remark. As already noted, the proof of Theorem 3 is a repetition in
abstract language of the extension of measure from the semiring of rectangles
to the minimal ring generated by this semiring, i.e., the class of elementary
sets.


26.2. Countably additive measures. Many problems in analysis involve 
unions of countably many sets, as well as unions of only finitely many sets. 
Correspondingly, the (finite) additivity imposed on measures in Definition 1 
turns out to be inadequate, and it is natural to introduce a stronger kind 
of additivity: 


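DEFINITION 2. A measure $\mu$ with domain of definition $\mathcal{S}_\mu$ is said to be
countably additive or $\sigma$-additive if

$$\mu(A) = \sum_{n=1}^{\infty} \mu(A_n)$$

for all sets $A, A_1, \ldots, A_n, \ldots \in \mathcal{S}_\mu$ satisfying the conditions

$$A = \bigcup_{n=1}^{\infty} A_n, \qquad A_i \cap A_j = \emptyset \quad (i \ne j).$$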


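Example. According to Theorem 10, p. 265, Lebesgue measure in the
plane is $\sigma$-additive.

THEOREM 4. Suppose a $\sigma$-additive measure $m$ on a semiring $\mathcal{S}_m$ is
extended to a measure $\mu$ on the ring $\mathcal{R}(\mathcal{S}_m)$. Then $\mu$ is also $\sigma$-additive.

Proof. Suppose

$$A \in \mathcal{R}(\mathcal{S}_m), \qquad B_n \in \mathcal{R}(\mathcal{S}_m) \quad (n = 1, 2, \ldots)$$

and

$$A = \bigcup_{n=1}^{\infty} B_n,$$

where

$$B_k \cap B_l = \emptyset \qquad (k \ne l).$$

Then, by Theorem 3, p. 34, there exist finite expansions

$$A = \bigcup_{j} A_j, \qquad B_n = \bigcup_{i} B_{ni}$$

with respect to sets $A_j$, $B_{ni}$ in $\mathcal{S}_m$, where

$$A_k \cap A_l = \emptyset, \qquad B_{nk} \cap B_{nl} = \emptyset \quad (k \ne l).$$

Let

$$C_{nij} = B_{ni} \cap A_j.$$

Then the sets $C_{nij}$ are pairwise disjoint and

$$A_j = \bigcup_{n} \bigcup_{i} C_{nij}, \qquad B_{ni} = \bigcup_{j} C_{nij}.$$

Therefore

$$m(A_j) = \sum_{n} \sum_{i} m(C_{nij}), \tag{3}$$

$$m(B_{ni}) = \sum_{j} m(C_{nij}), \tag{4}$$

since $m$ is $\sigma$-additive on $\mathcal{S}_m$, and moreover

$$\mu(A) = \sum_{j} m(A_j), \tag{5}$$

$$\mu(B_n) = \sum_{i} m(B_{ni}), \tag{6}$$

by the definition of the measure $\mu$. Comparing (3)-(6), we find that

$$\mu(A) = \sum_{j} m(A_j) = \sum_{j} \sum_{n} \sum_{i} m(C_{nij}) = \sum_{n} \sum_{i} m(B_{ni}) = \sum_{n} \mu(B_n)$$

(the sums over $i$ and $j$ are finite, while those over $n$ are convergent). ∎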




Next we generalize Theorems 1 and 2 to the case of $\sigma$-additive measures:


THEOREM 1'. Let $\mu$ be a $\sigma$-additive measure on a semiring $\mathcal{S}_\mu$, and
suppose the sets $A, A_1, \ldots, A_n, \ldots$, where $A_1, \ldots, A_n, \ldots$ are pairwise
disjoint subsets of $A$, all belong to $\mathcal{S}_\mu$. Then

$$\sum_{k=1}^{\infty} \mu(A_k) \le \mu(A). \tag{7}$$

Proof. By Theorem 1,

$$\sum_{k=1}^{n} \mu(A_k) \le \mu(A)$$

for all $n = 1, 2, \ldots$. Taking the limit as $n \to \infty$, we get (7). ∎


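THEOREM 2'. Let $\mu$ be a $\sigma$-additive measure on a semiring $\mathcal{S}_\mu$, and
suppose the sets $A, A_1, \ldots, A_k, \ldots$ all belong to $\mathcal{S}_\mu$ and satisfy the
condition

$$A \subset \bigcup_{k=1}^{\infty} A_k.$$

Then

$$\mu(A) \le \sum_{k=1}^{\infty} \mu(A_k). \tag{8}$$

Proof. By Theorem 4, we can assume that $\mu$ is defined on the ring
$\mathcal{R}(\mathcal{S}_\mu)$, instead of just on the semiring $\mathcal{S}_\mu$. In fact, if $\mu$ is $\sigma$-additive,
so is its extension on $\mathcal{R}(\mathcal{S}_\mu)$, which we continue to denote by $\mu$, and the
validity of (8) on $\mathcal{R}(\mathcal{S}_\mu)$ obviously implies its validity on $\mathcal{S}_\mu$. The sets

$$B_n = (A \cap A_n) - \bigcup_{k=1}^{n-1} A_k$$

belong to $\mathcal{R}(\mathcal{S}_\mu)$ and clearly satisfy the conditions

$$A = \bigcup_{n=1}^{\infty} B_n, \qquad B_n \subset A_n, \qquad B_k \cap B_l = \emptyset \quad (k \ne l).$$

Therefore

$$\mu(A) = \sum_{n=1}^{\infty} \mu(B_n) \le \sum_{n=1}^{\infty} \mu(A_n).$$ ∎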


Problem 1. Let $X = \{x_1, x_2, \ldots\}$ be any countable set, and let $p_1, p_2, \ldots$
be positive numbers such that

$$\sum_{n=1}^{\infty} p_n = 1.$$

On the set $\mathcal{S}_\mu$ of all subsets of $X$, define a measure $\mu$ by the formula

$$\mu(A) = \sum_{x_n \in A} p_n \qquad (A \subset X),$$

where the sum is over all $n$ such that $x_n \in A$. Prove that $\mu$ is a $\sigma$-additive
measure, with $\mu(X) = 1$.

Comment. This kind of measure arises quite naturally in many problems
of probability theory.
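
The following sketch illustrates (but of course does not prove) the statement
of Problem 1 for one hypothetical choice of weights, $p_n = 2^{-n}$, with $x_n$
identified with the index $n$. A subset $A \subset X$ is described by an indicator
function, and the series for $\mu(A)$ is truncated after `terms` terms; both
conventions are assumptions of this example.

```python
from fractions import Fraction


def p(n):
    """Hypothetical weights p_n = 2^(-n); they are positive and sum to 1."""
    return Fraction(1, 2 ** n)


def mu(indicator, terms=40):
    """Partial sum of p_n over all n <= terms with x_n in A.

    `indicator(n)` is True exactly when x_n belongs to A."""
    return sum((p(n) for n in range(1, terms + 1) if indicator(n)), Fraction(0))


def evens(n):
    return n % 2 == 0


def odds(n):
    return n % 2 == 1


def whole(n):
    return True


# Additivity on the disjoint decomposition X = evens ∪ odds.
print(mu(evens) + mu(odds) == mu(whole))
# mu(X) differs from 1 only by the neglected tail 2^(-40).
print(1 - mu(whole))
```

With these weights the exact values are $\mu(\text{evens}) = 1/3$ and
$\mu(\text{odds}) = 2/3$, and the truncated sums agree with them up to the tail of
the series.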


Problem 2. Let $X$ be the set of all rational points in the closed unit
interval $[0, 1]$, and let $\mathcal{S}_\mu$ be the set of all intersections of the set $X$ with
arbitrary closed, open and half-open subintervals of $[0, 1]$, including the
degenerate closed intervals consisting of a single point. Prove that $\mathcal{S}_\mu$ is a
semiring. Define a measure $\mu$ on $\mathcal{S}_\mu$ by the formula

$$\mu(A_{ab}) = b - a,$$

where $A_{ab}$ is the intersection of $X$ with any of the intervals $[a, b]$, $(a, b)$,
$(a, b]$, $[a, b)$. Prove that $\mu$ is additive, but not $\sigma$-additive.


Hint. Although $\mu(X) = 1$, $X$ is a countable union of single-element sets,
each of measure zero.


Problem 3. Let $\mu$ be a measure which is additive, but not $\sigma$-additive.
Prove that


a) Theorem 1' continues to hold for $\mu$;
b) Theorem 2' fails to hold for $\mu$.


Hint. Use Problem 2. 


Problem 4. Given a measure $\mu$ on a semiring $\mathcal{S}_\mu$, suppose

$$\mu(A) \le \sum_{k=1}^{\infty} \mu(A_k)$$

whenever the sets $A, A_1, \ldots, A_k, \ldots$ all belong to $\mathcal{S}_\mu$ and satisfy the
condition

$$A \subset \bigcup_{k=1}^{\infty} A_k.$$

Prove that $\mu$ is $\sigma$-additive.


Comment. It is often easier to verify that $\mu$ has this property than to
prove the $\sigma$-additivity of $\mu$ directly.


27. Extensions of Measures 


Any measure $m$ defined on a semiring $\mathcal{S}_m$ can be extended to a measure
defined on the ring $\mathcal{R}(\mathcal{S}_m)$, i.e., the minimal ring generated by $\mathcal{S}_m$.
However, if $m$ is $\sigma$-additive, we can extend $m$ to a measure defined on a much
larger class of sets than $\mathcal{R}(\mathcal{S}_m)$. This is done by the abstract analogue of
the procedure used in Sec. 25.2 to construct Lebesgue measure in the plane.
Assuming that $\mathcal{S}_m$ has a unit,² we begin with the analogues of Definitions
2-5, pp. 259-260.


DEFINITION 1. Let $m$ be a $\sigma$-additive measure on a semiring $\mathcal{S}_m$ with
a unit $E$. Then by the outer measure of a set $A \subset E$ is meant the number

$$\mu^*(A) = \inf_{A \subset \bigcup_k B_k} \sum_{k} m(B_k),$$

where the greatest lower bound is taken over all coverings of $A$ by a finite
or countable system of sets $B_k \in \mathcal{S}_m$.


DEFINITION 2. By the inner measure of a set $A \subset E$ is meant the
number

$$\mu_*(A) = m(E) - \mu^*(E - A).$$

Remark. By the exact analogue of Theorem 3, p. 258, it follows that

$$\mu_*(A) \le \mu^*(A).$$

DEFINITION 3. A set $A$ is said to be (Lebesgue) measurable if

$$\mu_*(A) = \mu^*(A),$$

i.e., if its inner and outer measures coincide.


DEFINITION 4. If a set $A$ is measurable, the number $\mu(A)$ equal to the
common value of $\mu_*(A)$ and $\mu^*(A)$ is called the Lebesgue measure of $A$.³


Remark. Clearly, a set $A \subset E$ is measurable if and only if

$$\mu^*(A) + \mu^*(E - A) = m(E). \tag{1}$$

In particular, it follows from (1) that if $A$ is measurable, so is $E - A$.


THEOREM 1. If $A$ is any set and $\{A_n\}$ is any finite or countable system
of sets such that

$$A \subset \bigcup_{n} A_n,$$

then

$$\mu^*(A) \le \sum_{n} \mu^*(A_n).$$

Proof. Exactly analogous to that of Theorem 4, p. 259. ∎


² The case where $\mathcal{S}_m$ fails to have a unit will be discussed later (after Theorem 7).

³ It turns out, of course, that $\mu$ is a measure as defined in Sec. 26.1 (see Theorem 5,
where the additivity of $\mu$ is proved). In particular, this justifies the use of the notation
$\mathcal{S}_\mu$ for the system of all measurable sets.

THEOREM 2. Every set $A \in \mathcal{R}(\mathcal{S}_m)$ is measurable, with Lebesgue
measure equal to $\tilde{m}(A)$, where $\tilde{m}$ is the extension of $m$ from the semiring
$\mathcal{S}_m$ to the ring $\mathcal{R}(\mathcal{S}_m)$.

Proof. Exactly analogous to that of Theorem 5, p. 259. ∎


THEOREM 3. A set $A$ is measurable if and only if, given any $\varepsilon > 0$,
there is a set $B \in \mathcal{R}(\mathcal{S}_m)$ such that

$$\mu^*(A \triangle B) < \varepsilon.$$

Proof. Exactly analogous to that of Theorem 6, p. 261. ∎


THEOREM 4. The system $\mathcal{S}_\mu$ of all measurable sets is a ring.

Proof. Exactly analogous to that of Theorem 7, p. 262 and its
corollary. ∎


Remark. Obviously $E$ is the unit of $\mathcal{S}_\mu$, so that $\mathcal{S}_\mu$ is an algebra of
sets (see p. 31).


THEOREM 5. The set function $\mu(A)$ is additive on $\mathcal{S}_\mu$.

Proof. Exactly analogous to that of Theorem 8, p. 263. ∎


THEOREM 6. The set function $\mu(A)$ is $\sigma$-additive on $\mathcal{S}_\mu$.

Proof. Exactly analogous to that of Theorem 10, p. 265. ∎


Remark. Thus $\mu$ is a $\sigma$-additive measure on the system $\mathcal{S}_\mu$ of all
measurable sets. This measure is called the Lebesgue extension of the original
measure $m$.


THEOREM 7. The system $\mathcal{S}_\mu$ of all measurable sets is a Borel algebra
with unit $E$.

Proof. Recall from p. 35 that a Borel algebra is closed under the
operations of taking countable unions and intersections. The proof is
the exact analogue of that of Theorem 9, p. 264. ∎


It is interesting to note that an arbitrary measurable set can be
approximated to within a set of measure zero by a set of a very special kind:


THEOREM 8. Given any set $A \in \mathcal{S}_\mu$, there are sets

$$B_{nk} \in \mathcal{R}(\mathcal{S}_m) \qquad (B_{n1} \subset B_{n2} \subset \cdots \subset B_{nk} \subset \cdots)$$

and corresponding sets

$$B_n = \bigcup_{k} B_{nk} \in \mathcal{S}_\mu \qquad (B_1 \supset B_2 \supset \cdots \supset B_n \supset \cdots)$$

such that

$$A \subset B = \bigcap_{n} B_n, \qquad \mu(A) = \mu(B).$$
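Proof. Given any $n$, we can cover $A$ by a union

$$C_n = \bigcup_{r} \Delta_{nr}$$

of sets $\Delta_{nr} \in \mathcal{S}_m$ such that

$$\mu(C_n) < \mu(A) + \frac{1}{n}.$$

Let

$$B_n = \bigcap_{k=1}^{n} C_k,$$

so that, in particular, $B_1 \supset B_2 \supset \cdots \supset B_n \supset \cdots$. Then it is easy to
see that

$$B_n = \bigcup_{s} \delta_{ns},$$

where $\delta_{ns} \in \mathcal{S}_m$. Next let

$$B_{nk} = \bigcup_{s=1}^{k} \delta_{ns},$$

so that, in particular,

$$B_n = \bigcup_{k} B_{nk}.$$

Then obviously $B_{nk} \in \mathcal{R}(\mathcal{S}_m)$ and $B_{n1} \subset B_{n2} \subset \cdots \subset B_{nk} \subset \cdots$.
Moreover

$$A \subset B = \bigcap_{n} B_n,$$

since $B$ is an intersection of sets containing $A$. It follows that

$$\mu(A) \le \mu(B). \tag{2}$$

On the other hand, $B \subset B_n \subset C_n$ for every $n$, and therefore

$$\mu(B) \le \mu(B_n) \le \mu(C_n) < \mu(A) + \frac{1}{n}.$$

Taking the limit as $n \to \infty$, we get

$$\mu(B) \le \mu(A),$$

which, together with (2), implies $\mu(A) = \mu(B)$. ∎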


Our construction of the Lebesgue extension of a measure $m$ defined on a
semiring $\mathcal{S}_m$ must be modified somewhat if $\mathcal{S}_m$ fails to have a unit. We
continue to use Definition 1 to define the outer measure $\mu^*$, but $\mu^*$ is now
defined only on the system $\mathcal{S}_{\mu^*}$ of all sets with coverings

$$\bigcup_{k} B_k \qquad (B_k \in \mathcal{S}_m)$$

such that

$$\sum_{k} m(B_k) < \infty.$$
k 
Since Definition 2 is meaningless in the absence of a unit, we now define 
measurable sets by using the property figuring in Theorem 3: 


DEFINITION 3'. A set $A$ is said to be (Lebesgue) measurable if, given
any $\varepsilon > 0$, there is a set $B \in \mathcal{R}(\mathcal{S}_m)$ such that $\mu^*(A \triangle B) < \varepsilon$.


DEFINITION 4'. If a set $A$ is measurable, the number $\mu(A)$ equal to
its outer measure $\mu^*(A)$ is called the (Lebesgue) measure of $A$.


Remark. Note that Definitions 3' and 4' are equivalent to Definitions 3
and 4 if $\mathcal{S}_m$ has a unit.


In the case where $\mathcal{S}_m$ has no unit, Theorems 4-6 continue to hold, since
the proofs of Theorems 5 and 6 do not require $\mathcal{S}_m$ to have a unit, while the
proof of Theorem 4 can easily be freed of this requirement (see Problem 4).
However, Theorem 7 now takes a new form (see Problem 5). As before, the
$\sigma$-additive measure $\mu$ on the system $\mathcal{S}_\mu$ of all measurable sets is called the
Lebesgue extension of the original measure $m$.


Remark. There is an interesting analogy between the construction of the
Lebesgue extension of a measure $m$ defined on a semiring $\mathcal{S}_m$ and the process
of completing a metric space. Let $\tilde{m}$ be the extension of $m$ from the semiring
$\mathcal{S}_m$ to the ring $\mathcal{R}(\mathcal{S}_m)$, and suppose we regard $\tilde{m}(A \triangle B)$ as the distance
between the elements $A, B \in \mathcal{R}(\mathcal{S}_m)$. Then $\mathcal{R}(\mathcal{S}_m)$ becomes a metric space
(in general, incomplete), whose completion, according to Theorem 3, is just
the system $\mathcal{S}_\mu$ of all Lebesgue-measurable sets. However, note that from a
metric point of view, two sets $A, B \in \mathcal{S}_\mu$ are indistinguishable if $\mu(A \triangle B) = 0$.


Problem 1. Let $m$ be a $\sigma$-additive measure on a semiring $\mathcal{S}_m$ with a unit
$E$, let $\mu$ be the Lebesgue extension of $m$, and let $\tilde{\mu}$ be an arbitrary $\sigma$-additive
extension of $m$. Prove that $\tilde{\mu}(A) = \mu(A)$ for every measurable set $A$ on
which $\tilde{\mu}$ is defined.


Hint. First show that $\mu_*(A) \le \tilde{\mu}(A) \le \mu^*(A)$.


Problem 2. Let $m$ be the same as in the preceding problem, and let $\tilde{m}$ be
the extension of $m$ to a measure defined on $\mathcal{R}(\mathcal{S}_m)$. Prove that the outer
measure of a set $A \subset E$ is given by

$$\mu^*(A) = \inf_{A \subset \bigcup_k B_k} \sum_{k} \tilde{m}(B_k),$$

where the greatest lower bound is taken over all coverings of $A$ by a finite
or countable system of sets $B_k \in \mathcal{R}(\mathcal{S}_m)$.


Problem 3. State and prove the analogues of Theorem 11, p. 266 and its
corollary for an arbitrary $\sigma$-additive measure $\mu$ defined on a Borel algebra
$\mathcal{S}_\mu$ with unit $E$.


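Problem 4. Give a proof of Theorem 4 valid in the case where $\mathcal{S}_m$ fails
to have a unit.


Hint. Suppose $A_1, A_2 \in \mathcal{S}_\mu$. Then $A_1 \cup A_2 \in \mathcal{S}_\mu$, by the same proof as
before (cf. p. 262). Moreover, there are sets $B_1, B_2 \in \mathcal{R}(\mathcal{S}_m)$ such that

$$\mu^*(A_1 \triangle B_1) < \frac{\varepsilon}{2}, \qquad \mu^*(A_2 \triangle B_2) < \frac{\varepsilon}{2}.$$

But

$$(A_1 - A_2) \triangle (B_1 - B_2) \subset (A_1 \triangle B_1) \cup (A_2 \triangle B_2),$$

and hence $\mu^*(A \triangle B) < \varepsilon$, where $A = A_1 - A_2$ and $B = B_1 - B_2 \in \mathcal{R}(\mathcal{S}_m)$.
Therefore $A_1 - A_2 \in \mathcal{S}_\mu$. To prove that $A_1 \cap A_2$ and $A_1 \triangle A_2$ belong to
$\mathcal{S}_\mu$, use the formulas

$$A_1 \cap A_2 = A_1 - (A_1 - A_2),$$
$$A_1 \triangle A_2 = (A_1 - A_2) \cup (A_2 - A_1).$$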


Problem 5. Given a measure $m$ on a semiring $\mathcal{S}_m$ with no unit, let $\mu$
be the Lebesgue extension of $m$ and $\mathcal{S}_\mu$ the corresponding system of all
measurable sets. Prove that


a) $\mathcal{S}_\mu$ is a $\delta$-ring (see p. 35);
b) The set

$$A = \bigcup_{k} A_k \qquad (A_k \in \mathcal{S}_\mu)$$

belongs to $\mathcal{S}_\mu$ if and only if there is a constant $C > 0$ such that

$$\mu\left(\bigcup_{k=1}^{n} A_k\right) \le C \tag{3}$$

for all $n = 1, 2, \ldots$

Comment. The necessity of the condition (3) is obvious, since our
measures are always finite.


Problem 6. Let $\mu$ and $\mathcal{S}_\mu$ be the same as in the preceding problem.
Prove that the system of all sets $B \in \mathcal{S}_\mu$ which are subsets of a fixed set
$A \in \mathcal{S}_\mu$ is a Borel algebra with unit $A$.


Problem 7. A measure $\mu$ is said to be complete if every subset of a set
of measure zero is measurable, i.e., if $A' \subset A$, $\mu(A) = 0$ implies $A' \in \mathcal{S}_\mu$.
(If $A' \in \mathcal{S}_\mu$, then obviously $\mu(A') = 0$.) Prove that the Lebesgue extension
of any measure $m$ is complete.

Hint. If $A' \subset A$ and $\mu(A) = 0$, then $\mu^*(A') = 0$. But $\emptyset \in \mathcal{R}(\mathcal{S}_m)$ and
$\mu^*(A' \triangle \emptyset) = \mu^*(A') = 0$.

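Problem 8. Let $m$ be a measure defined on a ring $\mathcal{R}$. For example, $m$
might be the extension of a measure $\tilde{m}$ originally defined on a semiring $\mathcal{S}_{\tilde{m}}$
to a measure defined on the minimal ring $\mathcal{R} = \mathcal{R}(\mathcal{S}_{\tilde{m}})$ generated by $\mathcal{S}_{\tilde{m}}$.
Then a set $A$ is said to be Jordan measurable if, given any $\varepsilon > 0$, there are
sets $A', A'' \in \mathcal{R}$ such that

$$A' \subset A \subset A'', \qquad m(A'' - A') < \varepsilon.$$

Prove that the system $\mathcal{R}^*$ of all Jordan-measurable sets is a ring containing
$\mathcal{R}$.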


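Problem 9. Let $m$, $\mathcal{R}$ and $\mathcal{R}^*$ be the same as in the preceding problem,
and let $\mathcal{A}$ be the system of all sets $A$ such that there is a set $B \in \mathcal{R}$ containing
$A$. Given any set $A \in \mathcal{A}$, let

$$\bar{\mu}(A) = \inf_{B \supset A,\ B \in \mathcal{R}} m(B),$$

$$\underline{\mu}(A) = \sup_{B \subset A,\ B \in \mathcal{R}} m(B)$$

(since $\emptyset \subset A$, $A$ always contains a set in $\mathcal{R}$). Prove that

a) $\underline{\mu}(A) \le \bar{\mu}(A)$;
b) The ring $\mathcal{R}^*$ coincides with the system of all sets $A \in \mathcal{A}$ for which
$\underline{\mu}(A) = \bar{\mu}(A)$;
c) If

$$A \subset \bigcup_{k=1}^{n} A_k,$$

where $A, A_1, \ldots, A_n$ all belong to $\mathcal{A}$, then

$$\bar{\mu}(A) \le \sum_{k=1}^{n} \bar{\mu}(A_k);$$

d) If $A_1, \ldots, A_n$ are pairwise disjoint sets contained in a set $A$, then

$$\underline{\mu}(A) \ge \sum_{k=1}^{n} \underline{\mu}(A_k).$$

By the Jordan measure of a set $A \in \mathcal{R}^*$, we mean the number $\mu(A)$ equal to
the common value of $\underline{\mu}(A)$ and $\bar{\mu}(A)$. Prove that $\mu$ is a measure on $\mathcal{R}^* = \mathcal{S}_\mu$.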


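Comment. The measure $\mu$ is called the Jordan extension of the measure
$m$. If $m$ is itself an extension of a measure $\tilde{m}$ originally defined on a semiring
$\mathcal{S}_{\tilde{m}}$, we write $\mathcal{R}^* = \mathcal{R}^*(\mathcal{S}_{\tilde{m}})$ and call $\mu$ the Jordan extension of the measure
$\tilde{m}$, as well as of the "intermediate" measure $m$.

Problem 10. Given two measures $m_1$ and $m_2$ defined on rings $\mathcal{R}_1$ and $\mathcal{R}_2$,
let $\mu_1$ and $\mu_2$ be their Jordan extensions onto the larger rings $\mathcal{R}_1^* = \mathcal{S}_{\mu_1}$ and
$\mathcal{R}_2^* = \mathcal{S}_{\mu_2}$. Prove that $\mu_1$ and $\mu_2$ coincide if and only if

$$\mathcal{R}_1 \subset \mathcal{S}_{\mu_2}, \qquad m_1(A) = \mu_2(A) \text{ for all } A \in \mathcal{R}_1,$$
$$\mathcal{R}_2 \subset \mathcal{S}_{\mu_1}, \qquad m_2(A) = \mu_1(A) \text{ for all } A \in \mathcal{R}_2.$$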

Problem 11. Let $m$ be the measure defined in Sec. 25.1 on the ring $\mathcal{R}$ of
all elementary sets (i.e., all finite unions of disjoint rectangles with sides
parallel to the coordinate axes), and let $\mu$ be the Jordan extension of $m$.
Prove that $\mu$ does not depend on the particular choice of the underlying
rectangular coordinate system. In other words, prove that $\mu$ (as well as
the corresponding ring $\mathcal{R}^* = \mathcal{S}_\mu$) does not change if all the sets in $\mathcal{R}$ are
subjected to the same shift and rigid rotation.


Problem 12. We say that a set $A$ is a set of uniqueness for a measure $m$ if


1) There is an extension of $m$ defined on $A$;
2) If $\mu_1$ and $\mu_2$ are two such extensions, then $\mu_1(A) = \mu_2(A)$.


Prove that the system of sets of uniqueness of a measure $m$ defined on a
semiring $\mathcal{S}_m$ coincides with the ring $\mathcal{R}^* = \mathcal{R}^*(\mathcal{S}_m)$ of sets which are Jordan
measurable (with respect to $m$). In other words, prove that the Jordan
extension of a measure $m$ originally defined on a semiring $\mathcal{S}_m$ is the unique
extension of $m$ to a measure defined on $\mathcal{R}^* = \mathcal{R}^*(\mathcal{S}_m)$, but that the
extension of $m$ to a larger system is no longer unique.


Problem 13. Prove that if a set A is Jordan measurable, then 


a) A is Lebesgue measurable; 
b) The Jordan and Lebesgue measures of A coincide. 


Prove that every Jordan extension of a $\sigma$-additive measure is $\sigma$-additive.


Problem 14. Give an example of a set which is Lebesgue measurable, but
not Jordan measurable.


Problem 15. We say that a set $A$ is a set of $\sigma$-uniqueness for a $\sigma$-additive
measure $m$ if

1) There is a $\sigma$-additive extension of $m$ defined on $A$;

2) If $\mu_1$ and $\mu_2$ are two such extensions, then $\mu_1(A) = \mu_2(A)$.

Prove that the system of sets of $\sigma$-uniqueness of a $\sigma$-additive measure $m$
defined on a semiring $\mathcal{S}_m$ coincides with the system of sets which are
Lebesgue measurable (with respect to $m$).


Hint. To show that every Lebesgue-measurable set $A$ is a set of
$\sigma$-uniqueness for $m$, choose any $\varepsilon > 0$. Then there is a set $B \in \mathcal{R} = \mathcal{R}(\mathcal{S}_m)$
such that $\mu^*(A \triangle B) < \varepsilon$. If $\tilde{\mu}$ is any $\sigma$-additive extension of $m$ defined on
$A$ (and on $\mathcal{R}$), then $\tilde{\mu}(B) = \tilde{m}(B)$, where $\tilde{m}$ is the unique extension of $m$ onto $\mathcal{R}$.
Moreover, $\tilde{\mu}(A \triangle B) \le \mu^*(A \triangle B) < \varepsilon$, and hence $|\tilde{\mu}(A) - \tilde{m}(B)| < \varepsilon$.
Therefore $|\mu_1(A) - \mu_2(A)| < 2\varepsilon$ if $\mu_1$ and $\mu_2$ are two $\sigma$-additive extensions
of $m$ defined on $A$ (and on $\mathcal{R}$). Hence $\mu_1(A) = \mu_2(A)$, by the arbitrariness
of $\varepsilon$.


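Problem 16. Let $m$ be a $\sigma$-additive measure defined on a semiring $\mathcal{S}_m$,
and let $\mathcal{L}$ be the domain of the Lebesgue extension of $m$. Let $m'$ be a
$\sigma$-additive extension of $m$ to a semiring $\mathcal{S}_{m'}$ such that

$$\mathcal{S}_m \subset \mathcal{S}_{m'} \subset \mathcal{L},$$

and let $\mathcal{L}'$ be the domain of the Lebesgue extension of $m'$. Prove that

$$\mathcal{L}' = \mathcal{L}.$$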


8 


INTEGRATION 


28. Measurable Functions 


28.1. Basic properties of measurable functions. Given any two sets $X$ and
$Y$, let $\mathcal{S}$ be a system of subsets of $X$ and $\mathcal{S}'$ a system of subsets of $Y$. Then
an abstract function $y = f(x)$ defined on $X$ and taking values in $Y$ is said
to be $(\mathcal{S}, \mathcal{S}')$-measurable if $A \in \mathcal{S}'$ implies $f^{-1}(A) \in \mathcal{S}$.


Example. Let $X$ and $Y$ both be the real line $R^1$, so that $y = f(x)$ is a
"function of a real variable." Moreover, let $\mathcal{S}$ and $\mathcal{S}'$ both be the system
of all open (or closed) subsets of $R^1$. Then our definition of measurability
reduces to that of continuity (recall Sec. 9.6). On the other hand, if we
choose both $\mathcal{S}$ and $\mathcal{S}'$ to be the system $\mathcal{B}^1$ of all Borel sets on the real line
(recall p. 36), our definition becomes that of a Borel-measurable (or simply
B-measurable) function.


In what follows, we will be primarily concerned with the notion of real
functions measurable with respect to some underlying measure $\mu$, this being
the case of greatest interest from the standpoint of integration theory. More
exactly, let $X$ be any set and $Y$ the real line $R^1$, with $\mathcal{S} = \mathcal{S}_\mu$ the domain of
definition of some $\sigma$-additive measure $\mu$ and $\mathcal{S}'$ the system $\mathcal{B}^1$ of all Borel
sets $B \subset R^1$. For simplicity, we assume that $\mathcal{S}_\mu$ has a unit equal to $X$ itself.
Moreover, since any $\sigma$-additive measure can be extended onto a Borel algebra
(by Theorem 7, p. 277), we might as well assume from the outset that $\mathcal{S}_\mu$
is a Borel algebra. These considerations suggest


DEFINITION 1. Given a $\sigma$-additive measure $\mu$ defined on a Borel algebra
$\mathcal{S}_\mu$ of subsets of a set $X$, where $X$ is the unit of $\mathcal{S}_\mu$, let $y = f(x)$ be a real
function defined on $X$, and let $\mathcal{B}^1$ be the set of all Borel sets on the real
line. Then the function $f$ is said to be $\mu$-measurable (on $X$) if $f^{-1}(A) \in \mathcal{S}_\mu$
for every $A \in \mathcal{B}^1$, or equivalently if $f^{-1}(\mathcal{B}^1) \subset \mathcal{S}_\mu$.

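THEOREM 1. A function $f$ is $\mu$-measurable if and only if the set
$\{x: f(x) < c\}$ is $\mu$-measurable (i.e., belongs to $\mathcal{S}_\mu$) for every real $c$.


Proof. If $f$ is $\mu$-measurable, then obviously so is $\{x: f(x) < c\}$, since
$(-\infty, c)$ is a Borel set. Conversely, let $\Sigma$ be the system of all semi-infinite
intervals $(-\infty, c)$, and suppose $f^{-1}(\Sigma) \subset \mathcal{S}_\mu$. Since $\mathcal{B}(\Sigma)$, the Borel
closure of $\Sigma$ (see p. 36), coincides with the system $\mathcal{B}^1$ of all Borel sets
on the line (why?), we have

$$f^{-1}(\mathcal{B}^1) = f^{-1}(\mathcal{B}(\Sigma)) = \mathcal{B}(f^{-1}(\Sigma)) \subset \mathcal{B}(\mathcal{S}_\mu)$$

(recall Problem 3e, p. 36). But $\mathcal{B}(\mathcal{S}_\mu) = \mathcal{S}_\mu$, since $\mathcal{S}_\mu$ is a Borel
algebra, and hence

$$f^{-1}(\mathcal{B}^1) \subset \mathcal{S}_\mu.$$ ∎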


THEOREM 2. Let $\{f_n\}$ be a sequence of $\mu$-measurable functions on $X$,
and let $f$ be a function on $X$ such that

$$f(x) = \lim_{n \to \infty} f_n(x)$$

for every $x \in X$. Then $f$ is itself $\mu$-measurable.

Proof. First we verify that

$$\{x: f(x) < c\} = \bigcup_{k} \bigcup_{n} \bigcap_{m > n} \left\{x: f_m(x) < c - \frac{1}{k}\right\}. \tag{1}$$

In fact, if $f(x) < c$, there is an integer $k > 0$ such that

$$f(x) < c - \frac{2}{k},$$

and then for this $k$, there is an integer $n > 0$ so large that

$$f_m(x) < c - \frac{1}{k} \tag{2}$$

for all $m > n$. Therefore every $x$ belonging to the left-hand side of (1)
also belongs to the right-hand side. Conversely, if $x$ belongs to the
right-hand side of (1), there is a $k$ such that (2) holds for all sufficiently
large $m$. But then, passing to the limit as $m \to \infty$, we get
$f(x) \le c - 1/k < c$, i.e., $x$ belongs to the left-hand side of (1).
Now, since the functions $f_m$ are $\mu$-measurable, the sets

$$\left\{x: f_m(x) < c - \frac{1}{k}\right\}$$

all belong to $\mathcal{S}_\mu$, and hence so does the right-hand side of (1), since $\mathcal{S}_\mu$
is a Borel algebra. Therefore $\{x: f(x) < c\} \in \mathcal{S}_\mu$. But then $f$ is
$\mu$-measurable, by Theorem 1. ∎


THEOREM 3. A B-measurable function of a $\mu$-measurable function is
itself $\mu$-measurable.


Proof. Let $f(x) = \varphi[\psi(x)]$, where $\varphi$ is B-measurable and $\psi$ is
$\mu$-measurable. If $A \subset R^1$ is any B-measurable set, then its preimage
$A' = \varphi^{-1}(A)$ is B-measurable, and hence the preimage $A'' = \psi^{-1}(A')$ is
$\mu$-measurable. But $A'' = f^{-1}(A)$, and hence $f$ is $\mu$-measurable. ∎


COROLLARY. A continuous function of a $\mu$-measurable function is
itself $\mu$-measurable.


Proof. A continuous function is clearly B-measurable. ∎


28.2. Simple functions. Algebraic operations on measurable functions. 
A function $f$ is said to be simple if it is $\mu$-measurable and takes no more
than countably many distinct values. This notion clearly depends on the
choice of the measure $\mu$.


The structure of simple functions is clarified by 


THEOREM 4. A function $f$ taking no more than countably many distinct
values $y_1, y_2, \ldots$ is $\mu$-measurable if and only if the sets

$$A_n = \{x: f(x) = y_n\} \qquad (n = 1, 2, \ldots)$$

are $\mu$-measurable.


Proof. Since each single-element set $\{y_n\}$ is a Borel set, the set $A_n$,
being the preimage of $\{y_n\}$, is measurable if $f$ is measurable.¹ Conversely,
suppose the sets $A_n$ are all measurable. Then the preimage $f^{-1}(B)$ of any
Borel set $B \subset R^1$ is measurable, being a union

$$\bigcup_{y_n \in B} A_n$$

of no more than countably many measurable sets $A_n$. But then $f$ is
measurable. ∎


The relation between measurable functions and simple functions is shown by 


THEOREM 5. A function f is u-measurable if and only if it can be 
represented as the limit of a uniformly convergent sequence of simple 
functions. 


1 For simplicity, we often say "measurable" instead of "$\mu$-measurable," omitting
explicit reference to the underlying measure $\mu$.

Proof. If $f$ is the (uniform) limit of a convergent sequence of simple
functions, then $f$ is $\mu$-measurable by Theorem 2, since simple functions
are $\mu$-measurable by definition. Conversely, given any $\mu$-measurable
function $f$, let

$$f_n(x) = \frac{m}{n} \quad \text{if} \quad \frac{m}{n} \le f(x) < \frac{m + 1}{n},$$

where $m$ is an integer and $n$ is a positive integer. Then the functions $f_n$
are simple and moreover converge uniformly to $f$ as $n \to \infty$, since

$$|f(x) - f_n(x)| \le \frac{1}{n}.$$ ∎
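
The approximation used in the second half of this proof amounts to rounding
$f(x)$ down to the nearest multiple of $1/n$, i.e., $f_n(x) = \lfloor n f(x) \rfloor / n$. The
sketch below checks the resulting bound on a sample function; the function
$f(x) = x^2 - 1/2$, the sample points and the name `simple_approx` are assumptions
made for illustration only.

```python
import math
from fractions import Fraction


def simple_approx(f, n):
    """Return f_n, where f_n(x) = m/n whenever m/n <= f(x) < (m + 1)/n.

    Here m = floor(n * f(x)) is an integer, possibly negative."""
    def f_n(x):
        return Fraction(math.floor(n * f(x)), n)
    return f_n


def f(x):
    """Hypothetical sample function taking both signs."""
    return x * x - Fraction(1, 2)


n = 10
f_n = simple_approx(f, n)
samples = [Fraction(k, 7) for k in range(-7, 8)]

# The error f(x) - f_n(x) lies in [0, 1/n) at every sample point.
print(all(0 <= f(x) - f_n(x) < Fraction(1, n) for x in samples))
# f_n takes only multiples of 1/n as values.
print(sorted({f_n(x) for x in samples}))
```

Since the bound $1/n$ does not depend on $x$, the convergence $f_n \to f$ is
uniform, which is exactly what the proof requires.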


The next few theorems show that the class of measurable functions is 
closed under the usual algebraic operations. 


THEOREM 6. If f and g are measurable, then so is f + g. 


Proof. First let $f$ and $g$ be simple functions, taking values $y_1, y_2, \ldots$
and $z_1, z_2, \ldots$, respectively. Then the sum $h = f + g$ can only take the
values $c = y_i + z_j$, where each such value is taken on a set of the form

$$\{x: h(x) = c\} = \bigcup_{y_i + z_j = c} \left(\{x: f(x) = y_i\} \cap \{x: g(x) = z_j\}\right). \tag{3}$$

There are no more than countably many values $c$ of the function $h =
f + g$, and moreover each set $\{x: h(x) = c\}$ is measurable, since the
right-hand side of (3) is clearly measurable. Therefore $h = f + g$ is a
simple function.

Now let $f$ and $g$ be arbitrary measurable functions, and let $\{f_n\}$ and
$\{g_n\}$ be sequences of simple functions converging uniformly to $f$ and $g$,
respectively, as in the proof of Theorem 5. Then the sequence of simple
functions $\{f_n + g_n\}$ converges uniformly to $f + g$, and hence $f + g$ is
measurable, by Theorem 5. ∎


THEOREM 7. If f is measurable, then so is cf, where c is an arbitrary 
constant. 


Proof. Obviously, the product of a simple function and a constant is
again simple. But if $\{f_n\}$ is a sequence of simple functions converging
uniformly to $f$, then $\{cf_n\}$ converges uniformly to $cf$, and hence $cf$ is
measurable, by Theorem 5. ∎


THEOREM 8. If $f$ and $g$ are measurable, then so is $f - g$.

Proof. An immediate consequence of Theorems 6 and 7. ∎


THEOREM 9. If $f$ and $g$ are measurable, then so is $fg$.

Proof. Clearly,

$$fg = \frac{1}{4}\left[(f + g)^2 - (f - g)^2\right].$$

But the expression on the right is a measurable function, by Theorems
6-8 and the fact that the square of a measurable function is measurable
(this follows from the corollary to Theorem 3). ∎


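THEOREM 10. If $f$ is measurable, then so is $1/f$, provided $f$ does not
vanish.


Proof. We have

$$\left\{x: \frac{1}{f(x)} < c\right\} = \left\{x: f(x) > \frac{1}{c}\right\} \cup \{x: f(x) < 0\}$$

if $c > 0$,

$$\left\{x: \frac{1}{f(x)} < c\right\} = \left\{x: \frac{1}{c} < f(x) < 0\right\}$$

if $c < 0$, and

$$\left\{x: \frac{1}{f(x)} < c\right\} = \{x: f(x) < 0\}$$

if $c = 0$. But in each case the set on the right is measurable. ∎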
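
The case analysis in this proof rewrites the condition $1/f(x) < c$ as a
condition on $f(x)$ alone: $f(x) > 1/c$ or $f(x) < 0$ when $c > 0$, $1/c < f(x) < 0$
when $c < 0$, and $f(x) < 0$ when $c = 0$. As a quick check under these stated
cases, the sketch below compares both sides on a grid of nonzero rational
values of $f(x)$; the helper names and the grid are assumptions of the example.

```python
from fractions import Fraction


def in_right_hand_set(fx, c):
    """Membership test for the set on the right, given the value fx = f(x) != 0."""
    if c > 0:
        return fx > 1 / c or fx < 0
    if c < 0:
        return 1 / c < fx < 0
    return fx < 0


def identity_holds(values, cs):
    """True if {1/f < c} and the right-hand set agree for all listed values."""
    return all((1 / fx < c) == in_right_hand_set(fx, c)
               for fx in values for c in cs)


# Nonzero sample values of f(x), and thresholds c of both signs and zero.
values = [Fraction(p, q) for p in range(-6, 7) if p != 0 for q in (1, 2, 3)]
cs = [Fraction(p, 2) for p in range(-6, 7)]
print(identity_holds(values, cs))
```

The grid deliberately contains the boundary values $f(x) = 1/c$, where the strict
inequalities matter.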


COROLLARY. If $f$ and $g$ are measurable, then so is $f/g$, provided $g$ does
not vanish.


Proof. An immediate consequence of Theorems 9 and 10. ∎


28.3. Equivalent functions. The values of a function can often be
neglected on a set of measure zero. This suggests


DEFINITION 2. Two functions $f$ and $g$ defined on the same set are said
to be equivalent (with respect to a measure $\mu$) if

$$\mu\{x: f(x) \ne g(x)\} = 0.$$

A property is said to hold almost everywhere (on $E$) if it holds at all points
(of $E$) except possibly on a set of measure zero. Thus two functions $f$ and $g$
are said to be equivalent (written $f \sim g$) if they coincide almost everywhere.


THEOREM 11. Given two functions $f$ and $g$ continuous on an interval $E$,
suppose $f$ and $g$ are equivalent (with respect to Lebesgue measure $\mu$ on the
line). Then $f$ and $g$ coincide.


Proof. Suppose $f(x_0) \ne g(x_0)$ at some point $x_0 \in E$, so that
$f(x_0) - g(x_0) \ne 0$. Since $f - g$ is continuous, there is a neighborhood of $x_0$
(possibly one-sided) in which $f - g$ is nonzero. This neighborhood has
positive measure, and hence

$$\mu\{x: f(x) \ne g(x)\} > 0,$$

i.e., $f$ and $g$ cannot be equivalent, contrary to hypothesis. ∎


Remark. Thus two continuous functions cannot be equivalent if they 
differ at even a single point. However, discontinuous functions can obviously 
be equivalent without being identical. For example, the Dirichlet function 

\[ f(x) = \begin{cases} 1 & \text{if } x \text{ is rational}, \\ 0 & \text{if } x \text{ is irrational} \end{cases} \]

is equivalent to the function g(x) = 0 (recall Problem 3, p. 268). 


THEOREM 12. A function f equivalent to a measurable function g is 
itself measurable. 


Proof. It follows from Definition 2 that the sets {x:f(x) < c} and 
{x:g(x) < c} can differ only by a set of measure zero. Hence if the second 
set is measurable, so is the first set. The proof is now an immediate 
consequence of Theorem 1. ∎


28.4. Convergence almost everywhere. Since the behavior of measurable 
functions on sets of measure zero is often unimportant, it is natural to 
introduce the following generalization of the ordinary notion of convergence 
of a sequence of functions: 


DEFINITION 3. A sequence of functions $\{f_n(x)\}$ defined on a space X 
is said to converge almost everywhere to a function f(x) if 

\[ \lim_{n \to \infty} f_n(x) = f(x) \tag{4} \]

for almost all $x \in X$, i.e., if the set of points for which (4) fails to hold is 
of measure zero. 


Example. The sequence $\{f_n(x)\} = \{(-x)^n\}$ defined on [0, 1] converges 
almost everywhere to the function f(x) = 0, in fact everywhere except at the 
point x = 1. 
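
The behavior in this example can be checked numerically. The following is only an illustrative sketch: the helper names `f_n` and `tail_spread` are introduced here for the purpose and are not part of the text. It measures how much the values $f_n(x)$ still vary for large n at a few sample points.

```python
def f_n(n: int, x: float) -> float:
    """n-th term of the example sequence f_n(x) = (-x)**n on [0, 1]."""
    return (-x) ** n


def tail_spread(x: float, start: int = 200, count: int = 50) -> float:
    """Spread (max minus min) of f_n(x) over n = start, ..., start + count - 1.

    A spread close to 0 is consistent with convergence at x;
    a spread that stays large indicates oscillation.
    """
    values = [f_n(n, x) for n in range(start, start + count)]
    return max(values) - min(values)


if __name__ == "__main__":
    for x in (0.0, 0.5, 0.9, 1.0):
        print(x, tail_spread(x))
```

At x = 1 the terms alternate between -1 and 1, so the spread stays equal to 2, while at the other sample points it is negligibly small.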


Theorem 2 now has the following generalization: 


THEOREM 2'. Let $\{f_n\}$ be a sequence of $\mu$-measurable functions on X, 
and let f be a function on X such that 

\[ f(x) = \lim_{n \to \infty} f_n(x) \tag{5} \]

almost everywhere on X. Then f is itself $\mu$-measurable, provided $\mu$ is 
complete.² 


Proof. If A is the set on which (5) holds, then $\mu(X - A) = 0$. The 
function f is measurable on A, by Theorem 2, and also on X - A, since 
every function is measurable on a set of measure zero if $\mu$ is complete 
(why?). Hence f is measurable on the whole set $X = A \cup (X - A)$. ∎


28.5. Egorov's theorem. The following important theorem shows the 
relation between the concepts of convergence almost everywhere and uniform 
convergence: 


THEOREM 12 (Egorov). Let $\{f_n\}$ be a sequence of measurable functions 
converging almost everywhere on a measurable set E to a function f. Then, 
given any $\delta > 0$, there exists a measurable set $E_\delta \subset E$ such that 


1) $\mu(E_\delta) > \mu(E) - \delta$; 
2) $\{f_n\}$ converges uniformly to f on $E_\delta$. 


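Proof. The function f is measurable, by Theorem 2'. Let 

\[ E_n^m = \bigcap_{i \ge n} \left\{x : |f_i(x) - f(x)| < \frac{1}{m}\right\}. \tag{6} \]

Thus, for fixed m and n, $E_n^m$ is the set of all points x such that 

\[ |f_i(x) - f(x)| < \frac{1}{m} \]

holds for all $i \ge n$. Moreover, let 

\[ E^m = \bigcup_{n=1}^{\infty} E_n^m. \]

It follows from (6) that 

\[ E_1^m \subset E_2^m \subset \cdots \subset E_n^m \subset \cdots, \]

and hence, by the corollary to Theorem 11, p. 267,³ given any m and 
any $\delta > 0$, there is an $n_0(m)$ such that 

\[ \mu\left(E^m - E^m_{n_0(m)}\right) < \frac{\delta}{2^m}. \tag{7} \]

Let 

\[ E_\delta = \bigcap_{m=1}^{\infty} E^m_{n_0(m)}. \]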


2 See Problem 7, p. 280. 
3 See also Problem 3, p. 280. 




Then $E_\delta$ satisfies the two conditions of the theorem. The fact that the 
sequence $\{f_n\}$ is uniformly convergent on $E_\delta$ is almost obvious, since if 
$x \in E_\delta$, then, given any m = 1, 2, ..., 

\[ |f_i(x) - f(x)| < \frac{1}{m} \]

for every $i \ge n_0(m)$. 

To verify condition 1), we now estimate the measure of the set $E - E_\delta$, 
noting first that $\mu(E - E^m) = 0$ for every m. In fact, if $x_0 \in E - E^m$, 
then there are arbitrarily large values of i such that 

\[ |f_i(x_0) - f(x_0)| \ge \frac{1}{m}, \]

which means that the sequence $\{f_n\}$ cannot converge to f at the point $x_0$. 
Therefore $\mu(E - E^m) = 0$, as asserted, since $\{f_n\}$ converges to f almost 
everywhere, by hypothesis. It follows from (7) that 

\[ \mu\left(E - E^m_{n_0(m)}\right) = \mu\left(E^m - E^m_{n_0(m)}\right) < \frac{\delta}{2^m}. \]

Therefore 

\[ \mu(E - E_\delta) = \mu\left(E - \bigcap_{m=1}^{\infty} E^m_{n_0(m)}\right) = \mu\left(\bigcup_{m=1}^{\infty} \left(E - E^m_{n_0(m)}\right)\right) \le \sum_{m=1}^{\infty} \mu\left(E - E^m_{n_0(m)}\right) < \sum_{m=1}^{\infty} \frac{\delta}{2^m} = \delta, \]

and hence $\mu(E_\delta) > \mu(E) - \delta$. 
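
Egorov's theorem can be illustrated on a concrete case. The sketch below is an assumption-laden example, not part of the text: it takes $f_n(x) = x^n$ on E = [0, 1], which converges to 0 everywhere except at x = 1, and uses the hypothetical choice $E_\delta = [0, 1 - \delta/2]$, whose measure exceeds $1 - \delta$. On this set the supremum of $|f_n|$ is attained at the right endpoint and tends to zero, whereas on [0, 1) the supremum equals 1 for every n.

```python
def sup_on_egorov_set(n: int, delta: float) -> float:
    """sup of |x**n - 0| over E_delta = [0, 1 - delta/2].

    x**n is increasing on [0, 1], so the sup is the value at the right endpoint.
    """
    return (1.0 - delta / 2.0) ** n


def first_index_below(eps: float, delta: float) -> int:
    """Smallest n for which the sup over E_delta drops below eps.

    The loop terminates because the sup tends to 0 for any delta in (0, 2).
    """
    n = 1
    while sup_on_egorov_set(n, delta) >= eps:
        n += 1
    return n


if __name__ == "__main__":
    for delta in (0.5, 0.1, 0.01):
        print(delta, first_index_below(0.01, delta))
```

The smaller the excluded measure $\delta$, the larger the index needed for a given accuracy, which reflects the fact that the convergence is not uniform on the whole set of convergence.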
Problem 1. Prove that the Dirichlet function 

\[ f(x) = \begin{cases} 1 & \text{if } x \text{ is rational}, \\ 0 & \text{if } x \text{ is irrational} \end{cases} \]

is measurable on every interval [a, b]. 
Problem 2. Do the same for the function 


a: ifx = Fis rational, 


f(x) = \9 q 


0 if x 1s irrational. 
Problem 3. Suppose f(x) is measurable on [a, b]. Is $g(x) = e^{f(x)}$ 
measurable on [a, b]? 

Problem 4. Prove that if f is measurable, then so is |f|. 

Problem 5. Let $\{f_n\}$ be a sequence of measurable functions converging 
almost everywhere to a function f. Prove that $\{f_n\}$ converges almost 
everywhere to a function g if and only if f and g are equivalent. 

Problem 6. A sequence $\{f_n\}$ of $\mu$-measurable functions is said to converge 
in measure to a function f if 

\[ \lim_{n \to \infty} \mu\{x : |f_n(x) - f(x)| \ge \delta\} = 0 \]

for every $\delta > 0$. Prove that if a sequence $\{f_n\}$ of measurable functions 
converges to f almost everywhere, then it converges to f in measure. 


Hint. Let A be the set (of measure zero) on which $\{f_n\}$ fails to converge 
to f, and let 

\[ E_k(\delta) = \{x : |f_k(x) - f(x)| \ge \delta\}, \qquad R_n(\delta) = \bigcup_{k=n}^{\infty} E_k(\delta), \qquad M = \bigcap_{n=1}^{\infty} R_n(\delta). \tag{8} \]

Then the sets (8) are all measurable (why?), and $\mu(R_n(\delta)) \to \mu(M)$ as $n \to \infty$, 
since $R_1(\delta) \supset R_2(\delta) \supset \cdots$. Prove that $M \subset A$ and hence that $\mu(M) = 0$ 
(as always, we assume that $\mu$ is complete). It follows that $\mu(R_n(\delta)) \to 0$ as 
$n \to \infty$. Now use the fact that $E_n(\delta) \subset R_n(\delta)$. 


Problem 7. Let $\{f_n\}$ be a sequence of measurable functions converging in 
measure to a function f. Prove that $\{f_n\}$ converges in measure to a function 
g if and only if f and g are equivalent. 


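Problem 8. Given any positive integer k, consider the functions 

\[ f_i^{(k)}(x) = \begin{cases} 1 & \text{if } \dfrac{i-1}{k} < x \le \dfrac{i}{k}, \\ 0 & \text{otherwise} \end{cases} \qquad (i = 1, \ldots, k), \]

defined on the half-open interval (0, 1]. Show that the sequence 

\[ f_1^{(1)}, f_1^{(2)}, f_2^{(2)}, \ldots, f_1^{(k)}, f_2^{(k)}, \ldots, f_k^{(k)}, \ldots \]

converges in measure to zero, but does not converge at any point whatsoever. 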


Comment. Thus the converse of the proposition in Problem 6 is false. 
Instead we have the weaker proposition considered in the next problem. 
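
The sequence of Problem 8 lends itself to a small experiment. The following sketch is illustrative only (the function names are made up for this purpose): it evaluates the indicators of the intervals $((i-1)/k, i/k]$ with exact rational arithmetic and shows that, at a fixed point, the value 1 occurs once for every k, while the measure of the set where each function is nonzero is 1/k.

```python
from fractions import Fraction


def f(i: int, k: int, x: Fraction) -> int:
    """Indicator of the interval ((i - 1)/k, i/k], for 1 <= i <= k."""
    return 1 if Fraction(i - 1, k) < x <= Fraction(i, k) else 0


def values_at(x: Fraction, max_k: int) -> list[int]:
    """Values at x of the sequence f(1,1), f(1,2), f(2,2), ..., f(max_k,max_k)."""
    return [f(i, k, x) for k in range(1, max_k + 1) for i in range(1, k + 1)]


def support_measure(k: int) -> Fraction:
    """Lebesgue measure of the set where f(i, k, .) equals 1 (the same for every i)."""
    return Fraction(1, k)


if __name__ == "__main__":
    vals = values_at(Fraction(1, 3), 20)
    print(sum(vals), len(vals), support_measure(20))
```

Since both 0 and 1 keep recurring at every point of (0, 1], the sequence of values cannot converge there, although the supports shrink in measure.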


Problem 9. Prove that if a sequence $\{f_n\}$ of functions converges to f in 
measure, then it contains a subsequence $\{f_{n_k}\}$ converging to f almost 
everywhere. 


Hint. Let $\{\delta_n\}$ be a sequence of positive numbers such that 

\[ \lim_{n \to \infty} \delta_n = 0, \]

and let $\{\varepsilon_n\}$ be a sequence of positive numbers such that 

\[ \sum_{n=1}^{\infty} \varepsilon_n < \infty. \]

Let $\{n_k\}$ be a sequence of positive integers such that $n_k > n_{k-1}$ and 

\[ \mu\{x : |f_{n_k}(x) - f(x)| \ge \delta_k\} < \varepsilon_k \qquad (k = 1, 2, \ldots). \]

Moreover, let 

\[ R_i = \bigcup_{k=i}^{\infty} \{x : |f_{n_k}(x) - f(x)| \ge \delta_k\}, \qquad Q = \bigcap_{i=1}^{\infty} R_i. \]

Then $\mu(R_i) \to \mu(Q)$ as $i \to \infty$, since $R_1 \supset R_2 \supset \cdots$. On the other hand, 

\[ \mu(R_i) < \sum_{k=i}^{\infty} \varepsilon_k, \]

and hence $\mu(R_i) \to 0$, so that $\mu(Q) = 0$. Now show that $\{f_{n_k}\}$ converges to 
f on E - Q. 
Problem 10. Prove that a function f defined on a closed interval [a, b] is 
$\mu$-measurable if and only if, given any $\varepsilon > 0$, there is a continuous function 
$\varphi$ on [a, b] such that $\mu\{x : f(x) \ne \varphi(x)\} < \varepsilon$. 


Hint. Use Egorov’s theorem. 


Comment. This result, known as Luzin's theorem, shows that a measurable 
function "can be made continuous by altering it on a set of arbitrarily small 
measure." 


29. The Lebesgue Integral 


The concept of the Riemann integral, familiar from calculus, applies 
only to functions which are either continuous or else do not have “too many” 
points of discontinuity. Hence we cannot form the Riemann integral of a 
general measurable function f. In fact, f may be discontinuous everywhere, 
or it may even be meaningless to talk about the continuity of f in the case 
where f is defined on an abstract set. For such functions, there is another 
fully developed notion of the integral, due to Lebesgue, which is more 
flexible than the notion of the Riemann integral. 

Let f be a function defined on a closed interval [a,b] of the x-axis. 
Then to form the Riemann integral of f, we divide [a, b] into many sub- 
intervals, thereby grouping together neighboring points of the x-axis. On 
the other hand, as we will see below, the Lebesgue integral is formed by 
grouping together points of the x-axis at which the function f takes 
neighboring values. In other words, the key idea of the theory of Lebesgue 
integration is to partition the range of the function f rather than its domain. 
This immediately makes it possible to extend the notion of integral to a very 
large class of functions. 
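
The contrast between partitioning the domain and partitioning the range can be made concrete. The sketch below is a hedged illustration under stated assumptions, not a construction from the text: it takes $f(x) = x^2$ on [0, 1] with ordinary length as the measure, so that the set where $y \le x^2 < y'$ has length $\sqrt{y'} - \sqrt{y}$. Both sums approximate the common value 1/3 from below.

```python
import math


def riemann_lower_sum(n: int) -> float:
    """Partition the DOMAIN [0, 1] into n equal pieces.

    f(x) = x**2 is increasing, so its infimum on each piece is the value
    at the left endpoint k/n.
    """
    return sum((k / n) ** 2 * (1.0 / n) for k in range(n))


def lebesgue_lower_sum(n: int) -> float:
    """Partition the RANGE [0, 1] into n equal levels y_k = k/n.

    Each level y_k is weighted by the length of the set
    {x in [0, 1] : y_k <= x**2 < y_(k+1)} = [sqrt(y_k), sqrt(y_(k+1))).
    """
    total = 0.0
    for k in range(n):
        y_lo, y_hi = k / n, (k + 1) / n
        total += y_lo * (math.sqrt(y_hi) - math.sqrt(y_lo))
    return total


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, riemann_lower_sum(n), lebesgue_lower_sum(n))
```

In the second sum the function differs from the chosen level by less than 1/n on every set of the partition, whatever the shape of those sets; this is what makes the approach available for functions that are far from continuous.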

Another advantage of the Lebesgue integral is that it is constructed in 
exactly the same way for functions defined on an abstract “measure space”’ 
(an arbitrary set X equipped with a measure) as for functions defined on the 
real line. This is to be contrasted with the situation for the Riemann integral, 
which is first introduced for functions of a single real variable and then 
extended, with suitable modifications, to the case of functions of several 
real variables, but fails to make any sense at all for functions defined on an 
abstract measure space. 

In what follows, unless the contrary is explicitly stated, we will consider 
a $\sigma$-additive measure $\mu$ defined on a Borel algebra of subsets of a set X, 
with X as the unit. We will assume that all sets under consideration are 
$\mu$-measurable, and that all functions under consideration are defined and 
$\mu$-measurable on X. 


29.1. Definition and basic properties of the Lebesgue integral. Let f be a 
simple function, i.e., a $\mu$-measurable function taking no more than countably 
many distinct values 

\[ y_1, y_2, \ldots, y_n, \ldots \tag{1} \]

Then by the (Lebesgue) integral of f over the set A, denoted by 

\[ \int_A f(x)\,d\mu, \]

we mean the quantity 

\[ \sum_n y_n \mu(A_n), \tag{2} \]

where 

\[ A_n = \{x : x \in A, f(x) = y_n\}, \]

provided the series (2) is absolutely convergent. If the Lebesgue integral 
of f exists, we say that f is integrable or summable (with respect to the measure 
$\mu$) on the set A. 


Example. Obviously, 

\[ \int_A 1 \cdot d\mu = \int_A d\mu = \mu(A). \]

We now get rid of the restriction that the numbers (1) be distinct: 


LEMMA. Given a simple function f defined on a set A, suppose A is a 
union 

\[ A = \bigcup_k B_k \]

of pairwise disjoint sets $B_k$ such that f takes only one value $c_k$ on $B_k$. Then 
f is integrable on A if and only if the series 

\[ \sum_k c_k \mu(B_k) \tag{3} \]

is absolutely convergent, in which case 

\[ \int_A f(x)\,d\mu = \sum_k c_k \mu(B_k). \]
Proof. Each set 

\[ A_n = \{x : x \in A, f(x) = y_n\} \]

is the union of the sets $B_k$ for which $c_k = y_n$. Therefore⁴ 

\[ \sum_n y_n \mu(A_n) = \sum_n y_n \sum_{c_k = y_n} \mu(B_k) = \sum_k c_k \mu(B_k). \]

Moreover, since $\mu$ is nonnegative, we have 

\[ \sum_n |y_n| \mu(A_n) = \sum_n |y_n| \sum_{c_k = y_n} \mu(B_k) = \sum_k |c_k| \mu(B_k), \]

so that the series (2) is absolutely convergent if and only if the series (3) 
is absolutely convergent. ∎
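
For a simple function with finitely many values, the definition and the lemma can be tried out directly. The sketch below is an illustration under assumptions of its own: a simple function is represented as a list of (value, measure) pairs, one pair per set of a disjoint decomposition, and the helper names are hypothetical.

```python
from fractions import Fraction


def integral_simple(pieces):
    """Sum of value * measure over the sets of a disjoint decomposition of A.

    With finitely many pieces the sum is finite, so absolute convergence
    is automatic.
    """
    return sum(value * measure for value, measure in pieces)


def merge_equal_values(pieces):
    """Group pieces carrying the same value, adding their measures.

    The result lists each value once, as in the definition with
    distinct values y_n and sets A_n.
    """
    merged = {}
    for value, measure in pieces:
        merged[value] = merged.get(value, Fraction(0)) + measure
    return list(merged.items())


if __name__ == "__main__":
    # The value 2 is taken on two different sets B_k, as the lemma allows.
    pieces = [
        (Fraction(2), Fraction(1, 4)),
        (Fraction(-1), Fraction(1, 2)),
        (Fraction(2), Fraction(1, 4)),
    ]
    print(integral_simple(pieces), integral_simple(merge_equal_values(pieces)))
```

Both calls return the same number, which is the content of the lemma in this finite case.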


THEOREM 1. Let f and g be simple functions integrable on a set A, and 
let k be any constant. Then f + g and kf are integrable over A, and 

\[ \int_A [f(x) + g(x)]\,d\mu = \int_A f(x)\,d\mu + \int_A g(x)\,d\mu, \tag{4} \]

\[ \int_A [k f(x)]\,d\mu = k \int_A f(x)\,d\mu. \tag{5} \]

Proof. Suppose f takes distinct values $y_i$ on sets $F_i \subset A$, while g 
takes distinct values $z_j$ on sets $G_j \subset A$, where i, j = 1, 2, .... Then 

\[ \int_A f(x)\,d\mu = \sum_i y_i \mu(F_i), \tag{6} \]

\[ \int_A g(x)\,d\mu = \sum_j z_j \mu(G_j). \tag{7} \]

Clearly, f + g takes the values $c_{ij} = y_i + z_j$ (not necessarily distinct) 
on the pairwise disjoint sets $B_{ij} = F_i \cap G_j$. It follows from 

\[ \mu(F_i) = \sum_j \mu(F_i \cap G_j), \qquad \mu(G_j) = \sum_i \mu(F_i \cap G_j) \]


and the absolute convergence of the series (6) and (7) that the series 

\[ \sum_i \sum_j c_{ij} \mu(B_{ij}) = \sum_i \sum_j (y_i + z_j) \mu(F_i \cap G_j) \]

is absolutely convergent. Hence, by the lemma, f + g is integrable on 
A and 

\[ \int_A [f(x) + g(x)]\,d\mu = \sum_i \sum_j (y_i + z_j) \mu(F_i \cap G_j) = \sum_i y_i \mu(F_i) + \sum_j z_j \mu(G_j). \tag{8} \]

Comparing (6)-(8), we get (4). The proof of (5) is trivial. ∎

4 The notation $\sum_{c_k = y_n}$ calls for the sum over all k such that $c_k = y_n$. 


THEOREM 2. Let f be a bounded simple function on A, where $|f(x)| \le M$ 
if $x \in A$. Then f is integrable on A and 

\[ \left|\int_A f(x)\,d\mu\right| \le M\mu(A). \]


Proof. If f takes values $y_n$ on sets $A_n \subset A$ (n = 1, 2, ...), then 

\[ \left|\int_A f(x)\,d\mu\right| = \left|\sum_n y_n \mu(A_n)\right| \le \sum_n |y_n| \mu(A_n) \le M \sum_n \mu(A_n) = M\mu(A), \]

where we have incidentally proved the integrability of f on A (how?). ∎

Next we remove the restriction that f be a simple function: 


DEFINITION. A measurable function f is said to be integrable (or 
summable) on a set A if there exists a sequence $\{f_n\}$ of integrable simple 
functions converging uniformly to f on A. The limit 

\[ \lim_{n \to \infty} \int_A f_n(x)\,d\mu \tag{9} \]

is then called the (Lebesgue) integral of f over the set A, denoted by 

\[ \int_A f(x)\,d\mu. \]


This definition relies tacitly on the following conditions being met: 


1) The limit (9) exists (and is finite) for any uniformly convergent sequence 
of integrable simple functions on A; 

2) For any given f, this limit is independent of the choice of the sequence 
$\{f_n\}$; 

3) For simple functions, the definitions of integrability and of the integral 
reduce to those given on p. 294. 

All these conditions are indeed satisfied. Condition 1) is an immediate 
consequence of the estimate 

\[ \left|\int_A f_n(x)\,d\mu - \int_A f_m(x)\,d\mu\right| = \left|\int_A [f_n(x) - f_m(x)]\,d\mu\right| \le \mu(A) \sup_{x \in A} |f_n(x) - f_m(x)|, \]

implied by Theorems 1 and 2. To prove 2), suppose the sequences $\{f_n\}$ and 
$\{f_n^*\}$ both converge uniformly to f, but 

\[ \lim_{n \to \infty} \int_A f_n(x)\,d\mu \ne \lim_{n \to \infty} \int_A f_n^*(x)\,d\mu. \]

Let $\{\varphi_n\}$ be the sequence 

\[ f_1, f_1^*, f_2, f_2^*, \ldots, f_n, f_n^*, \ldots \]

Then $\{\varphi_n\}$ converges uniformly to f, but 

\[ \lim_{n \to \infty} \int_A \varphi_n(x)\,d\mu \]

fails to exist, contrary to condition 1). Finally, to prove 3), if f is simple, 
we need only consider the trivial sequence $\{f_n\}$ with general term $f_n = f$. 

THEOREM 1’. Theorem 1 continues to hold if f and g are arbitrary 
measurable functions integrable on A. 


Proof. An immediate consequence of Theorem 1, after taking suitable 
uniform limits of integrable simple functions. ∎


THEOREM 3. If $\varphi$ is nonnegative and integrable on A and if $|f(x)| \le \varphi(x)$ 
almost everywhere on A, then f is also integrable on A and 

\[ \left|\int_A f(x)\,d\mu\right| \le \int_A \varphi(x)\,d\mu. \tag{10} \]


Proof. If f and $\varphi$ are simple functions, then, by subtracting a set of 
measure zero from A, we get a set A' which can be represented as a 
finite or countable union 

\[ A' = \bigcup_n A_n \]

of subsets $A_n \subset A'$ such that 

\[ f(x) = a_n, \qquad \varphi(x) = b_n \]

for all $x \in A_n$ and 

\[ |a_n| \le b_n \qquad (n = 1, 2, \ldots). \]

Since $\varphi$ is integrable on A, we have 

\[ \sum_n |a_n| \mu(A_n) \le \sum_n b_n \mu(A_n) = \int_{A'} \varphi(x)\,d\mu = \int_A \varphi(x)\,d\mu \tag{11} \]

(see Problem 3b). Therefore f is also integrable on A, and 

\[ \left|\int_A f(x)\,d\mu\right| = \left|\int_{A'} f(x)\,d\mu\right| = \left|\sum_n a_n \mu(A_n)\right| \le \sum_n |a_n| \mu(A_n). \tag{12} \]

Comparing (11) and (12), we get (10). 

In the case where f and $\varphi$ are arbitrary measurable functions, let 
$\{f_n\}$ and $\{\varphi_n\}$ be sequences of simple functions converging uniformly to f 
and $\varphi$, respectively, constructed in the same way as in the proof of 
Theorem 5, p. 286. Then clearly 

\[ |f_n(x)| \le \varphi_n(x) \qquad (n = 1, 2, \ldots) \]

on A'. Moreover each $\varphi_n$ is integrable, since $\varphi$ is integrable by 
hypothesis. It follows that each $f_n$, and hence f itself, is integrable, where 

\[ \left|\int_A f_n(x)\,d\mu\right| \le \int_A \varphi_n(x)\,d\mu. \]

Taking the limit as $n \to \infty$, we again get (10). ∎

COROLLARY. If f is bounded and measurable on A, then f is integrable 
on A. 


Proof. Choose $\varphi(x) = M$, where 

\[ M = \sup_{x \in A} |f(x)|. \]


29.2. Some key theorems. We now prove some important properties of 
the Lebesgue integral, regarded as a set function 

\[ F(A) = \int_A f(x)\,d\mu \tag{13} \]

defined on a system of measurable sets (with the integrand f held fixed). 

THEOREM 4. Let 

\[ A = \bigcup_n A_n \]

be a finite or countable union of pairwise disjoint sets $A_n$, and suppose f is 
integrable on A. Then f is integrable on each $A_n$ and 

\[ \int_A f(x)\,d\mu = \sum_n \int_{A_n} f(x)\,d\mu, \tag{14} \]

where the series on the right is absolutely convergent. 


Proof. First let f be a simple function, taking the values $y_1, y_2, \ldots$, 
and let 

\[ B_k = \{x : x \in A, f(x) = y_k\}, \qquad B_{nk} = \{x : x \in A_n, f(x) = y_k\}. \]

Then 

\[ \int_A f(x)\,d\mu = \sum_k y_k \mu(B_k) = \sum_k y_k \sum_n \mu(B_{nk}) = \sum_n \sum_k y_k \mu(B_{nk}) = \sum_n \int_{A_n} f(x)\,d\mu. \tag{15} \]

Since f is integrable on A, the series $\sum_k y_k \mu(B_k)$ converges absolutely, and 
hence so do the other series in (15). (Here we use the nonnegativity of 
the measure $\mu$.) In particular, f is integrable on each set $A_n$. 

Next let f be an arbitrary measurable function integrable on A. Then, 
given any $\varepsilon > 0$, there is a simple function g integrable on A such that 

\[ |f(x) - g(x)| < \varepsilon \qquad (x \in A). \tag{16} \]

For g we have 

\[ \int_A g(x)\,d\mu = \sum_n \int_{A_n} g(x)\,d\mu, \tag{17} \]

as just shown, where g is integrable on each $A_n$ and the series converges 
absolutely. Hence, by (16), f is also integrable on each $A_n$ and 

\[ \sum_n \left|\int_{A_n} f(x)\,d\mu - \int_{A_n} g(x)\,d\mu\right| \le \sum_n \varepsilon\mu(A_n) = \varepsilon\mu(A), \]

\[ \left|\int_A f(x)\,d\mu - \int_A g(x)\,d\mu\right| \le \varepsilon\mu(A), \]

which, together with (17), implies the absolute convergence of the series 

\[ \sum_n \int_{A_n} f(x)\,d\mu \]

and the estimate 

\[ \left|\int_A f(x)\,d\mu - \sum_n \int_{A_n} f(x)\,d\mu\right| \le 2\varepsilon\mu(A). \tag{18} \]

But (18) implies (14), since $\varepsilon > 0$ is arbitrary. ∎


COROLLARY. If f is integrable on A, then f is integrable on every 
measurable subset $A' \subset A$. 


Proof. Think of A as the union of the disjoint sets A' and A - A'. ∎


Remark. A succinct way of expressing the property (14) is to say that 
the set function (13) is $\sigma$-additive. 


THEOREM 5 (Chebyshev's inequality). If f is nonnegative and integrable 
on A, then 

\[ \mu\{x : x \in A, f(x) \ge c\} \le \frac{1}{c} \int_A f(x)\,d\mu. \]

Proof. If 

\[ A' = \{x : x \in A, f(x) \ge c\}, \]

then 

\[ \int_A f(x)\,d\mu = \int_{A'} f(x)\,d\mu + \int_{A - A'} f(x)\,d\mu \ge \int_{A'} f(x)\,d\mu \ge c\mu(A') \]

(see Problem 4a). ∎
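
Chebyshev's inequality is easy to verify on a nonnegative simple function with finitely many values. The sketch below is illustrative only: it reuses the hypothetical representation of a simple function as (value, measure) pairs and assumes a positive threshold c.

```python
from fractions import Fraction


def integral(pieces):
    """Integral of a simple function given as (value, measure) pairs."""
    return sum(value * measure for value, measure in pieces)


def measure_at_least(pieces, c):
    """Measure of the set where the function is >= c."""
    return sum(measure for value, measure in pieces if value >= c)


def chebyshev_holds(pieces, c):
    """Check mu{f >= c} <= (1/c) * integral of f, for c > 0 and f >= 0."""
    return measure_at_least(pieces, c) <= integral(pieces) / c


if __name__ == "__main__":
    # f takes the values 0, 1, 4 on sets of measure 1/2, 1/4, 1/4.
    pieces = [
        (Fraction(0), Fraction(1, 2)),
        (Fraction(1), Fraction(1, 4)),
        (Fraction(4), Fraction(1, 4)),
    ]
    for c in (Fraction(1), Fraction(2), Fraction(4)):
        print(c, measure_at_least(pieces, c), integral(pieces) / c)
```

Here the integral equals 5/4, and for c = 4 the two sides are 1/4 and 5/16, so the bound is close to being attained.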


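COROLLARY. If 

\[ \int_A |f(x)|\,d\mu = 0, \]

then f(x) = 0 almost everywhere. 


Proof. By Chebyshev's inequality, 

\[ \mu\left\{x : x \in A, |f(x)| \ge \frac{1}{n}\right\} \le n \int_A |f(x)|\,d\mu = 0 \]

for all n = 1, 2, .... Therefore 

\[ \mu\{x : x \in A, f(x) \ne 0\} \le \sum_{n=1}^{\infty} \mu\left\{x : x \in A, |f(x)| \ge \frac{1}{n}\right\} = 0. \] ∎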
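THEOREM 6. If f is integrable on a set A, then, given any $\varepsilon > 0$, there 
is a $\delta > 0$ such that 

\[ \left|\int_E f(x)\,d\mu\right| < \varepsilon \]

for every measurable set $E \subset A$ of measure less than $\delta$. 


Proof. The proof is immediate if f is bounded, since then 

\[ \left|\int_E f(x)\,d\mu\right| \le \sup_{x \in A} |f(x)| \, \mu(E) \]

(see Problem 4c). In the general case, let 

\[ A_n = \{x : x \in A, n \le |f(x)| < n + 1\}, \]

\[ B_N = \bigcup_{n=0}^{N} A_n, \]

\[ C_N = A - B_N. \]

Then, by Theorem 4, 

\[ \int_A |f(x)|\,d\mu = \sum_{n=0}^{\infty} \int_{A_n} |f(x)|\,d\mu. \]

Let N be such that 

\[ \sum_{n=N+1}^{\infty} \int_{A_n} |f(x)|\,d\mu = \int_{C_N} |f(x)|\,d\mu < \frac{\varepsilon}{2}, \]

and let 

\[ 0 < \delta < \frac{\varepsilon}{2(N + 1)}. \]

Then $\mu(E) < \delta$ implies 

\[ \left|\int_E f(x)\,d\mu\right| \le \int_E |f(x)|\,d\mu = \int_{E \cap B_N} |f(x)|\,d\mu + \int_{E \cap C_N} |f(x)|\,d\mu \le (N + 1)\mu(E) + \int_{C_N} |f(x)|\,d\mu < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon. \] ∎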


Remark. The property figuring in Theorem 6 is expressed by saying that 
the set function (13) is absolutely continuous with respect to the measure $\mu$. 

Problem 1. Prove that the Dirichlet function 

\[ f(x) = \begin{cases} 1 & \text{if } x \text{ is rational}, \\ 0 & \text{if } x \text{ is irrational} \end{cases} \]

fails to have a Riemann integral over any interval [a, b]. Prove that the 
Lebesgue integral of f over any measurable set A exists and equals zero. 


Problem 2. Find the Lebesgue integral of the function 


if x =F js rational, 
q 


I 
f(x) = \4 

j if x is irrational 
over the interval [a, b]. 


Problem 3. Prove that 


a) If f is integrable on a set Z of measure zero, then 

\[ \int_Z f(x)\,d\mu = 0; \]

b) If f is integrable on A, then 

\[ \int_{A'} f(x)\,d\mu = \int_A f(x)\,d\mu \]

for every subset $A' \subset A$ such that $\mu(A - A') = 0$. 

Comment. We can regard a) as a limiting case of Theorem 6. 

Problem 4. Prove that 


a) If f is nonnegative and integrable on A, then 

\[ \int_A f(x)\,d\mu \ge 0; \]

b) If f and g are integrable on A and $f(x) \le g(x)$ almost everywhere, then 

\[ \int_A f(x)\,d\mu \le \int_A g(x)\,d\mu; \]

c) If f is integrable on A and $m \le f(x) \le M$ almost everywhere, then 

\[ m\mu(A) \le \int_A f(x)\,d\mu \le M\mu(A). \]


Problem 5. Prove that the existence of either of the integrals 

\[ \int_A f(x)\,d\mu, \qquad \int_A |f(x)|\,d\mu \]

implies the existence of the other. 

Problem 6. Let 

\[ A = \bigcup_n A_n \]

be a finite or countable union of pairwise disjoint sets $A_n$, and suppose f 
is integrable on each $A_n$ and satisfies the condition 

\[ \sum_n \int_{A_n} |f(x)|\,d\mu < \infty. \tag{19} \]

Prove that f is integrable on A. 

Hint. If f is simple, with values $y_1, y_2, \ldots$, let the sets $B_k$ and $B_{nk}$ be 
the same as in the proof of Theorem 4. Then 

\[ \int_{A_n} |f(x)|\,d\mu = \sum_k |y_k| \mu(B_{nk}). \]

The absolute convergence of (19) implies the convergence of 

\[ \sum_n \sum_k |y_k| \mu(B_{nk}) = \sum_k |y_k| \sum_n \mu(B_{nk}) = \sum_k |y_k| \mu(B_k), \]

and hence the integrability of f on A. In the general case, let g be a simple 
function approximating f, and show that (19) implies the convergence of 

\[ \sum_n \int_{A_n} |g(x)|\,d\mu, \]

so that g, and hence f, is integrable on A. 

Comment. This is essentially the converse of Theorem 4. 


Problem 7. Let $\mu$ be a $\sigma$-additive measure defined on a Borel algebra $\mathscr{S}_\mu$ 
of subsets of a given set X, and let f be nonnegative and integrable on X 
(with respect to $\mu$). Prove that the set function 

\[ F(A) = \int_A f(x)\,d\mu \]

is itself a $\sigma$-additive measure on $\mathscr{S}_\mu$, with the property that F(A) = 0 
whenever $\mu(A) = 0$. 


Problem 8. Suppose f is integrable on sets $A_1, A_2, \ldots, A_n, \ldots$ such that 

\[ A_1 \supset A_2 \supset \cdots \supset A_n \supset \cdots, \]

and let 

\[ A = \bigcap_n A_n. \]

Does 

\[ \int_{A_n} f(x)\,d\mu \]

converge to 

\[ \int_A f(x)\,d\mu\,? \]


30. Further Properties of the Lebesgue Integral 


30.1. Passage to the limit in Lebesgue integrals. The problem of taking 
limits behind the integral sign, or equivalently of integrating a convergent 
series term by term, is often encountered in analysis. In the classical theory 
of integration, it is proved that a sufficient condition for taking such a limit 
is that the series (or sequence) in question be uniformly convergent. We 
now examine the corresponding theorems for Lebesgue integrals, which 
constitute a rather far-reaching generalization of their classical counterparts. 


THEOREM 1 (Lebesgue's bounded convergence theorem). Let $\{f_n\}$ be a 
sequence of functions converging to a limit f on A, and suppose 

\[ |f_n(x)| \le \varphi(x) \qquad (x \in A, \; n = 1, 2, \ldots), \]

where $\varphi$ is integrable on A. Then f is integrable on A and 

\[ \lim_{n \to \infty} \int_A f_n(x)\,d\mu = \int_A f(x)\,d\mu. \]


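Proof. Clearly $|f(x)| \le \varphi(x)$, and hence f is integrable, by Theorem 3, 
p. 297. Let 

\[ A_k = \{x : k - 1 \le \varphi(x) < k\}, \]

\[ B_m = \bigcup_{k \ge m+1} A_k = \{x : \varphi(x) \ge m\}. \]

By Theorem 4, p. 298, 

\[ \int_A \varphi(x)\,d\mu = \sum_k \int_{A_k} \varphi(x)\,d\mu, \tag{1} \]

where the series on the right is absolutely convergent. By the same token, 

\[ \int_{B_m} \varphi(x)\,d\mu = \sum_{k \ge m+1} \int_{A_k} \varphi(x)\,d\mu. \]

Given any $\varepsilon > 0$, there is an integer m such that 

\[ \int_{B_m} \varphi(x)\,d\mu < \frac{\varepsilon}{5}, \]

since the series (1) converges. Moreover, $\varphi(x) < m$ on $A - B_m$. By 
Egorov's theorem (Theorem 12, p. 290), $A - B_m$ can be represented in 
the form 

\[ A - B_m = C \cup D, \]

where $\{f_n\}$ converges uniformly to f on C and 

\[ \mu(D) < \frac{\varepsilon}{5m}. \]

Let N be such that 

\[ |f_n(x) - f(x)| < \frac{\varepsilon}{5\mu(C)} \]

on C if n > N. Then 

\[ \int_A [f_n(x) - f(x)]\,d\mu = \int_{B_m} f_n(x)\,d\mu - \int_{B_m} f(x)\,d\mu + \int_D f_n(x)\,d\mu - \int_D f(x)\,d\mu + \int_C [f_n(x) - f(x)]\,d\mu, \]

and hence 

\[ \left|\int_A f_n(x)\,d\mu - \int_A f(x)\,d\mu\right| = \left|\int_A [f_n(x) - f(x)]\,d\mu\right| \le \int_{B_m} |f_n(x)|\,d\mu + \int_{B_m} |f(x)|\,d\mu + \int_D |f_n(x)|\,d\mu + \int_D |f(x)|\,d\mu + \int_C |f_n(x) - f(x)|\,d\mu < \frac{\varepsilon}{5} + \frac{\varepsilon}{5} + \frac{\varepsilon}{5} + \frac{\varepsilon}{5} + \frac{\varepsilon}{5} = \varepsilon, \]

which implies (1), since $\varepsilon > 0$ is arbitrary. ∎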
COROLLARY. If $|f_n(x)| \le M$ and $f_n \to f$, then 

\[ \lim_{n \to \infty} \int_A f_n(x)\,d\mu = \int_A f(x)\,d\mu. \]

Proof. Choose $\varphi(x) = M$, noting that every constant is integrable 
on A. ∎

Remark. The values taken by a function on a set of measure zero have 
no effect on its integral. Hence in Theorem 1 we need only assume that $\{f_n\}$ 
converges to f almost everywhere and that the inequality $|f_n(x)| \le \varphi(x)$ 
holds almost everywhere. 
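
A simple instance of the corollary, in the almost-everywhere form of the remark, is the sequence $f_n(x) = x^n$ on [0, 1]: it is bounded by M = 1 and converges to 0 except at x = 1. The sketch below is only an illustration with made-up helper names; it compares the exact integrals 1/(n + 1), obtained from the antiderivative, with a midpoint-rule approximation.

```python
from fractions import Fraction


def exact_integral(n: int) -> Fraction:
    """Integral of x**n over [0, 1], from the antiderivative x**(n+1)/(n+1)."""
    return Fraction(1, n + 1)


def midpoint_integral(n: int, cells: int = 10_000) -> float:
    """Midpoint-rule approximation of the same integral (a numerical cross-check)."""
    h = 1.0 / cells
    return sum(((j + 0.5) * h) ** n for j in range(cells)) * h


if __name__ == "__main__":
    for n in (1, 9, 99, 999):
        print(n, float(exact_integral(n)), midpoint_integral(n))
```

The integrals tend to 0, which is the integral of the limit function, in agreement with the corollary.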


THEOREM 2 (Levi). Suppose 

\[ f_1(x) \le f_2(x) \le \cdots \le f_n(x) \le \cdots \]

on a set A, where the functions $f_n$ are all integrable and 

\[ \int_A f_n(x)\,d\mu \le M \qquad (n = 1, 2, \ldots) \tag{2} \]

for some constant M. Then the limit 

\[ f(x) = \lim_{n \to \infty} f_n(x) \]

exists (and is finite) almost everywhere on A.⁵ Moreover, f is integrable 
and 

\[ \lim_{n \to \infty} \int_A f_n(x)\,d\mu = \int_A f(x)\,d\mu. \]


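Proof. It can be assumed that $f_1(x) \ge 0$, since otherwise we need 
only replace the $f_n$ by $f_n - f_1$. Let 

\[ \Omega = \{x : x \in A, f_n(x) \to \infty\}. \]

Then clearly 

\[ \Omega = \bigcap_r \bigcup_n \Omega_n^{(r)}, \]

where 

\[ \Omega_n^{(r)} = \{x : x \in A, f_n(x) > r\}. \]

It follows from (2) and Chebyshev's inequality (Theorem 5, p. 299) that 

\[ \mu\left(\Omega_n^{(r)}\right) \le \frac{M}{r}. \]

Moreover 

\[ \mu\left(\bigcup_n \Omega_n^{(r)}\right) \le \frac{M}{r}, \]

since 

\[ \Omega_1^{(r)} \subset \Omega_2^{(r)} \subset \cdots \subset \Omega_n^{(r)} \subset \cdots. \]

But 

\[ \Omega \subset \bigcup_n \Omega_n^{(r)} \]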


for any r, and hence 

\[ \mu(\Omega) \le \frac{M}{r}. \]

Since r can be arbitrarily large, this implies 

\[ \mu(\Omega) = 0, \]

thereby showing that the sequence $\{f_n(x)\}$ has a finite limit f(x) for 
almost all $x \in A$. 

Now let 

\[ A_r = \{x : r - 1 \le f(x) < r\}, \]

and let $\varphi$ be the simple function such that 

\[ \varphi(x) = r \quad \text{if } x \in A_r \qquad (r = 1, 2, \ldots). \]

Moreover, let 

\[ B_s = \bigcup_{r=1}^{s} A_r. \]

Since the functions $f_n$ and f are bounded on $B_s$ and since 

\[ \varphi(x) \le f(x) + 1, \]

we have 

\[ \int_{B_s} \varphi(x)\,d\mu \le \int_{B_s} f(x)\,d\mu + \mu(A) = \lim_{n \to \infty} \int_{B_s} f_n(x)\,d\mu + \mu(A) \le M + \mu(A), \]

where we use the corollary to Theorem 1. But 

\[ \int_{B_s} \varphi(x)\,d\mu = \sum_{r=1}^{s} r\mu(A_r), \]

and hence 

\[ \sum_{r=1}^{s} r\mu(A_r) \le M + \mu(A) \]

for all s = 1, 2, .... Therefore 

\[ \sum_{r=1}^{\infty} r\mu(A_r) < \infty, \]

i.e., $\varphi$ is integrable on A, with integral 

\[ \int_A \varphi(x)\,d\mu = \sum_{r=1}^{\infty} r\mu(A_r). \]

Since $f_n(x) \le \varphi(x)$, the validity of (3) is now an immediate consequence 
of Lebesgue's bounded convergence theorem (Theorem 1). ∎

5 The function f can be defined in an arbitrary way on the set E where the limit (2) 
fails to exist, for example, by setting f(x) = 0 on E. 

COROLLARY. If $\varphi_k(x) \ge 0$ and 

\[ \sum_{k=1}^{\infty} \int_A \varphi_k(x)\,d\mu < \infty, \]

then the series 

\[ \sum_{k=1}^{\infty} \varphi_k(x) \]

converges almost everywhere on A and 

\[ \int_A \left(\sum_{k=1}^{\infty} \varphi_k(x)\right) d\mu = \sum_{k=1}^{\infty} \int_A \varphi_k(x)\,d\mu. \]

Proof. Apply Theorem 2 to the functions 

\[ f_n(x) = \sum_{k=1}^{n} \varphi_k(x). \]


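THEOREM 3 (Fatou). Let $\{f_n\}$ be a sequence of nonnegative functions 
integrable on a set A, such that 

\[ \int_A f_n(x)\,d\mu \le M \qquad (n = 1, 2, \ldots). \]

Suppose $\{f_n\}$ converges almost everywhere on A to a function f. Then f is 
integrable on A and 

\[ \int_A f(x)\,d\mu \le M. \]

Proof. Let 

\[ \varphi_n(x) = \inf_{k \ge n} f_k(x). \]

Then $\varphi_n$ is measurable, since 

\[ \{x : \varphi_n(x) < c\} = \bigcup_{k \ge n} \{x : f_k(x) < c\}. \]

Moreover 

\[ 0 \le \varphi_n(x) \le f_n(x), \]

and hence $\varphi_n$ is integrable, by Theorem 3, p. 297, with 

\[ \int_A \varphi_n(x)\,d\mu \le \int_A f_n(x)\,d\mu \le M \qquad (n = 1, 2, \ldots). \]

Clearly 

\[ \varphi_1(x) \le \varphi_2(x) \le \cdots \le \varphi_n(x) \le \cdots \]

and 

\[ \lim_{n \to \infty} \varphi_n(x) = f(x) \]

almost everywhere. Applying Theorem 2 to the sequence $\{\varphi_n\}$, we find 
that f is integrable and 

\[ \int_A f(x)\,d\mu = \lim_{n \to \infty} \int_A \varphi_n(x)\,d\mu \le M. \] ∎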
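
The inequality in Fatou's theorem can be strict, and the limit of the integrals need not equal the integral of the limit. A standard example, sketched here with hypothetical helper names, is the function equal to n on (0, 1/n) and to 0 elsewhere on [0, 1]: every integral equals 1, while the pointwise limit is 0. Such a sequence cannot be dominated by an integrable function, since otherwise Theorem 1 would force the integrals to tend to 0.

```python
from fractions import Fraction


def f_n(n: int, x: Fraction) -> int:
    """n times the indicator of the open interval (0, 1/n)."""
    return n if 0 < x < Fraction(1, n) else 0


def integral_f_n(n: int) -> Fraction:
    """Integral over [0, 1]: the value n times the length 1/n of (0, 1/n)."""
    return n * Fraction(1, n)


if __name__ == "__main__":
    x = Fraction(1, 7)
    # f_n(x) is nonzero only while 1/n > x, that is, for n < 7.
    print([f_n(n, x) for n in range(1, 12)])
    print([integral_f_n(n) for n in range(1, 6)])
```

Here M = 1 bounds all the integrals, and the integral of the limit function is 0, consistent with the conclusion of the theorem.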
30.2. The Lebesgue integral over a set of infinite measure. So far all our 
measures have been finite (except for Remark 3, p. 267), and hence everything 
said about the Lebesgue integral and its properties has been tacitly understood 
to apply only to the case of functions defined on sets of finite measure. 
However, one often deals with functions defined on a set X of infinite measure, 
for example, the real line equipped with ordinary Lebesgue measure. We 
will confine ourselves to the case of greatest practical interest, where X can 
be represented as a union 


\[ X = \bigcup_n X_n, \qquad \mu(X_n) < \infty \tag{3} \]

of countably many sets $X_n$, each of finite measure with respect to some 
$\sigma$-additive measure $\mu$ defined on a $\sigma$-ring of subsets of X (the sets of finite 
measure). Such a measure is called $\sigma$-finite. For example, Lebesgue measure 
on the line, in the plane, or more generally in n-space is $\sigma$-finite. For 
simplicity, and without loss of generality (why?), we will assume that the 
sequence $\{X_n\}$ is increasing, i.e., that 

\[ X_1 \subset X_2 \subset \cdots \subset X_n \subset \cdots. \tag{4} \]


A sequence {X,,} satisfying the conditions (3) and (4) will be called exhaustive. 
For example, the sequence {£,,} in Remark 3, p. 267 is an exhaustive sequence 
(with respect to ordinary Lebesgue measure), whose union is the whole 
plane. 

Now let f be a measurable function on X.⁶ Then f is said to be integrable 
(or summable) on X if it is integrable on every measurable subset $A \subset X$ 
and if the limit 

\[ \lim_{n \to \infty} \int_{X_n} f(x)\,d\mu \tag{5} \]

exists (and is finite) for every exhaustive sequence $\{X_n\}$. The limit (5) is then 
called the (Lebesgue) integral of f over the set X, denoted by 

\[ \int_X f(x)\,d\mu. \]


Remark 1. The limit (5) is independent of the choice of the exhaustive 
sequence $\{X_n\}$. In fact, suppose 

\[ \lim_{n \to \infty} \int_{X_n} f(x)\,d\mu \ne \lim_{n \to \infty} \int_{X_n^*} f(x)\,d\mu, \]

where $\{X_n^*\}$ is another exhaustive sequence. Define a new sequence $\{\Omega_n\}$ 
such that 

\[ \Omega_1 = X_1, \]

$\Omega_{2n}$ is any set of $\{X_n^*\}$ containing $\Omega_{2n-1}$, 
$\Omega_{2n+1}$ is any set of $\{X_n\}$ containing $\Omega_{2n}$ 

(why do such sets exist?). Then $\{\Omega_n\}$ is exhaustive, but 

\[ \lim_{n \to \infty} \int_{\Omega_n} f(x)\,d\mu \]

fails to exist, contrary to hypothesis. 

6 A real function y = f(x) is now said to be measurable if the set $f^{-1}(A) \cap X_n$ is 
measurable for every $X_n$ and every Borel set A (this being the obvious slight generalization 
of Definition 1, p. 284). 


Remark 2. The integral of a simple function is defined in the same way 
as on p. 294. It is clear that a necessary (but not sufficient) condition for 
integrability of a simple function f is that f take every nonzero value on a set 
of finite measure. 


30.3. The Lebesgue integral vs. the Riemann integral. Finally we examine 
the relation between the Lebesgue integral and the Riemann integral, 
restricting ourselves to the case of ordinary Lebesgue measure on the line: 


THEOREM 4. If the Riemann integral 

\[ I = \int_a^b f(x)\,dx \]

exists, then f is Lebesgue integrable on [a, b] and 

\[ \int_{[a,b]} f(x)\,d\mu = I. \tag{6} \]


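Proof. Introducing the points of subdivision 

\[ x_k = a + \frac{k}{2^n}(b - a) \qquad (k = 0, 1, \ldots, 2^n), \]

we partition [a, b] into $2^n$ subintervals. Let 

\[ \Delta_n = \frac{b - a}{2^n} \sum_{k=1}^{2^n} M_{nk}, \qquad \delta_n = \frac{b - a}{2^n} \sum_{k=1}^{2^n} m_{nk} \]

be the corresponding Darboux sums, where $M_{nk}$ is the least upper bound 
and $m_{nk}$ the greatest lower bound of f on the subinterval $x_{k-1} \le x \le x_k$. 
By the definition of the Riemann integral, 

\[ I = \lim_{n \to \infty} \Delta_n = \lim_{n \to \infty} \delta_n. \]

Consider the functions 

\[ \bar{f}_n(x) = M_{nk} \quad \text{if } x_{k-1} \le x < x_k, \]
\[ \underline{f}_n(x) = m_{nk} \quad \text{if } x_{k-1} \le x < x_k, \]
\[ \bar{f}_n(b) = \underline{f}_n(b) = f(b). \]

Then clearly 

\[ \int_{[a,b]} \bar{f}_n(x)\,d\mu = \Delta_n, \qquad \int_{[a,b]} \underline{f}_n(x)\,d\mu = \delta_n. \tag{7} \]

Moreover, 

\[ \bar{f}_1(x) \ge \bar{f}_2(x) \ge \cdots \ge \bar{f}_n(x) \ge \cdots \ge f(x), \]
\[ \underline{f}_1(x) \le \underline{f}_2(x) \le \cdots \le \underline{f}_n(x) \le \cdots \le f(x), \]

and hence 

\[ \lim_{n \to \infty} \bar{f}_n(x) = \bar{f}(x) \ge f(x), \]
\[ \lim_{n \to \infty} \underline{f}_n(x) = \underline{f}(x) \le f(x). \]

Using (7) and Theorem 2, we find that 

\[ \int_{[a,b]} \bar{f}(x)\,d\mu = \lim_{n \to \infty} \int_{[a,b]} \bar{f}_n(x)\,d\mu = \lim_{n \to \infty} \Delta_n = I = \lim_{n \to \infty} \delta_n = \lim_{n \to \infty} \int_{[a,b]} \underline{f}_n(x)\,d\mu = \int_{[a,b]} \underline{f}(x)\,d\mu \tag{8} \]

(see also Problem 2). Therefore 

\[ \int_{[a,b]} |\bar{f}(x) - \underline{f}(x)|\,d\mu = \int_{[a,b]} \{\bar{f}(x) - \underline{f}(x)\}\,d\mu = 0, \]

and hence 

\[ \bar{f}(x) - \underline{f}(x) = 0 \]

almost everywhere, by the corollary on p. 300. In other words, 

\[ \bar{f}(x) = \underline{f}(x) = f(x) \tag{9} \]

almost everywhere. Comparing (8) and (9), we get (6). ∎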
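
The dyadic Darboux sums used in this proof can be computed exactly in a simple case. The sketch below is an illustration under assumptions of its own: it takes $f(x) = x^2$ on [0, 1], which is increasing, so the least upper bound and greatest lower bound on each subinterval are the values at its right and left endpoints.

```python
from fractions import Fraction


def darboux_sums(n: int):
    """Upper and lower Darboux sums of f(x) = x**2 on [0, 1] for 2**n equal subintervals."""
    cells = 2 ** n
    width = Fraction(1, cells)

    def f(x):
        return x * x

    # f is increasing: sup on [x_(k-1), x_k] is f(x_k), inf is f(x_(k-1)).
    upper = sum(f(k * width) for k in range(1, cells + 1)) * width
    lower = sum(f(k * width) for k in range(0, cells)) * width
    return upper, lower


if __name__ == "__main__":
    for n in range(1, 8):
        upper, lower = darboux_sums(n)
        print(n, float(lower), float(upper), upper - lower)
```

The upper sums decrease and the lower sums increase as n grows, and their difference $2^{-n}$ tends to zero, so both squeeze the value 1/3 of the integral.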
Problem 1. Prove that 

\[ \lim_{n \to \infty} \int_A f_n(x) g(x)\,d\mu = \int_A f(x) g(x)\,d\mu \]

if the sequence $\{f_n\}$ satisfies the conditions of Theorem 1 (as stated more 
generally in the remark on p. 305) and if g is essentially bounded on A in 
the sense that there is a constant M > 0 such that $|g(x)| \le M$ almost 
everywhere on A. 

Comment. If g is essentially bounded on A, then the quantity 

\[ \operatorname*{ess\,sup}_{x \in A} |g(x)| = \inf_{\substack{Z \subset A \\ \mu(Z) = 0}} \left\{ \sup_{x \in A - Z} |g(x)| \right\}, \]

called the essential supremum of g on A, is finite. 


Problem 2. Prove that Theorem 2 remains valid if 

\[ f_1(x) \ge f_2(x) \ge \cdots \ge f_n(x) \ge \cdots \]

and if (2) is replaced by the condition 

\[ \int_A f_n(x)\,d\mu \ge M \qquad (n = 1, 2, \ldots). \]


Problem 3. Consider the system $\mathscr{S}$ of all subsets of the real line 
containing only finitely many points, and let the measure $\mu(A)$ of a set $A \in \mathscr{S}$ 
be defined as the number of points in A. Prove that 


a) $\mathscr{S}$ is a ring without a unit; 
b) $\mu$ is not $\sigma$-finite. 


Problem 4. Why do we talk about a o-ring rather than a o-algebra on 
p. 308? 


Problem 5. Prove that if a function f vanishes outside a set of finite 
measure, then its Lebesgue integral as defined on p. 308 coincides with its 
Lebesgue integral as previously defined. 


Problem 6. Show that the analogue of the definition on p. 296 cannot be 
used to define the Lebesgue integral in the case where A is of infinite measure. 


Hint. Give an example of a uniformly convergent sequence $\{f_n\}$ of 
integrable simple functions such that 

\[ \lim_{n \to \infty} \int_A f_n(x)\,d\mu \]

fails to exist. 


Problem 7. Which of the theorems of Sec. 29 continue to hold for 
integrals over sets of infinite measure? 


Hint. The corollary on p. 298 fails if A is of infinite measure. 


Problem 8. Verify that Theorems 1-3 of Sec. 30.1 continue to hold for 
integrals over sets of infinite measure. 


Problem 9. Given a nonnegative function f, suppose the Riemann integral 

\[ \int_{a+\varepsilon}^{b} f(x)\,dx \]

exists for every $\varepsilon > 0$ and approaches a finite limit as $\varepsilon \to 0+$, so that the 
improper Riemann integral 

\[ \int_a^b f(x)\,dx = \lim_{\varepsilon \to 0+} \int_{a+\varepsilon}^{b} f(x)\,dx \tag{10} \]

exists. Prove that f is Lebesgue integrable on [a, b] and 

\[ \int_{[a,b]} f(x)\,d\mu = \int_a^b f(x)\,dx. \]

Comment. On the other hand, if f is of variable sign and if 

\[ \lim_{\varepsilon \to 0+} \int_{a+\varepsilon}^{b} |f(x)|\,dx = \infty, \]

then the Lebesgue integral of f over [a, b] fails to exist, even if the improper 
Riemann integral (10) exists. In fact, by Problem 5, p. 302, summability 
of f would imply that of |f|. 


Problem 10. Prove that the integral

$$\int_0^1 \frac{1}{x} \sin\frac{1}{x}\,dx$$

exists as an improper Riemann integral, but not as a Lebesgue integral.


Problem 11. Suppose f is Riemann integrable over an infinite interval 
(such an integral can exist only in the improper sense). Prove that f is 
Lebesgue integrable over the same interval if and only if the improper 
integral converges absolutely. 


Comment. For example, the function

$$f(x) = \frac{\sin x}{x}$$

is not Lebesgue integrable over $(-\infty, \infty)$, since

$$\int_{-\infty}^{\infty} \left| \frac{\sin x}{x} \right| dx = \infty.$$

On the other hand, f has an improper Riemann integral equal to

$$\int_{-\infty}^{\infty} \frac{\sin x}{x}\,dx = \pi.$$
DIFFERENTIATION


Let f be a summable function defined on a space X, equipped with a
σ-additive measure $\mu$. Then the (Lebesgue) integral

$$\int_E f(x)\,d\mu \tag{1}$$

exists for every measurable $E \subset X$, thereby defining a set function on the
system $\mathscr{S}_\mu$ of all measurable subsets of X. If X is the real line, equipped
with ordinary Lebesgue measure $\mu$, and if E = [a, b] is a closed interval, we
write (1) simply as

$$\int_a^b f(x)\,dx,$$

or equivalently as

$$\int_a^b f(t)\,dt \tag{2}$$

in terms of the new dummy variable of integration t (here we anticipate
subsequent notational convenience). Then (2) is clearly a function of the
lower limit of integration a and the upper limit of integration b. Suppose we
fix a, but leave b variable, indicating this by replacing b by the symbol x.
Then (2) reduces to the "indefinite Lebesgue integral"

$$\int_a^x f(t)\,dt,$$

with its upper limit of integration variable.
Now let f be continuous, and let F have a continuous derivative. Then
it will be recalled from elementary calculus that the connection between
the operations of differentiation and integration is expressed by the familiar
formulas

$$\frac{d}{dx} \int_a^x f(t)\,dt = f(x), \tag{3}$$

$$\int_a^b F'(t)\,dt = F(b) - F(a). \tag{4}$$

This immediately suggests two questions: 


1) Does (3) continue to hold for an arbitrary summable function f? 
2) What is the largest class of functions for which (4) holds? 


These questions will be answered in Secs. 31-33. The study of the general 
set function (1) will be resumed in Sec. 34. 


31. Differentiation of the Indefinite Lebesgue Integral 


31.1. Basic properties of monotonic functions. We begin our study of the 
indefinite Lebesgue integral 


$$F(x) = \int_a^x f(t)\,dt \tag{1}$$


as a function of its upper limit by making the following obvious but important
observation. If f is nonnegative, then (1) is a nondecreasing function.
Moreover, since every summable function f(t) is the difference


$$f(t) = f_+(t) - f_-(t)$$


of two nonnegative summable functions (which?), the integral (1) is 
the difference between two nondecreasing functions. Hence, the study of the 
Lebesgue integral as a function of its upper limit is closely related to the 
study of monotonic functions. Monotonic functions are interesting in their 
own right, and have a number of simple and important properties which 
we now discuss. Here all functions will be regarded as defined on some 
fixed interval [a, b] unless the contrary is explicitly stated. 


DEFINITION 1. A function f is said to be nondecreasing if $x_1 < x_2$
implies $f(x_1) \le f(x_2)$ and nonincreasing if $x_1 < x_2$ implies $f(x_1) \ge f(x_2)$.
By a monotonic function is meant a function which is either nondecreasing 
or nonincreasing. 


DEFINITION 2. Given any function f, the limit

$$\lim_{\substack{\varepsilon \to 0 \\ \varepsilon > 0}} f(x_0 + \varepsilon)$$

(provided it exists) is called the right-hand limit of f at the point $x_0$,
denoted by $f(x_0 + 0)$. Similarly, the limit

$$\lim_{\substack{\varepsilon \to 0 \\ \varepsilon > 0}} f(x_0 - \varepsilon)$$

is called the left-hand limit of f at $x_0$, denoted by $f(x_0 - 0)$.


Remark. If

$$f(x_0 + 0) = f(x_0 - 0),$$

then clearly f is either continuous at $x_0$ or has a removable discontinuity
at $x_0$.


DEFINITION 3. A function f is said to be continuous from the right at
$x_0$ if

$$f(x_0) = f(x_0 + 0),$$

and continuous from the left at $x_0$ if

$$f(x_0) = f(x_0 - 0).$$


DEFINITION 4. By a discontinuity point of the first kind of a function f
is meant a point $x_0$ at which the limits $f(x_0 + 0)$ and $f(x_0 - 0)$ exist but are
unequal. The difference

$$f(x_0 + 0) - f(x_0 - 0)$$

is then called the jump of f at $x_0$.


Example. Given no more than countably many points

$$x_1, x_2, \ldots, x_n, \ldots$$

in the interval [a, b], let

$$h_1, h_2, \ldots, h_n, \ldots$$

be corresponding positive numbers such that

$$\sum_n h_n < \infty.$$

Then the function

$$f(x) = \sum_{x_n < x} h_n, \tag{2}$$

where the sum is over all n such that $x_n < x$, is obviously nondecreasing.
A monotonic function of this particularly simple type is called a jump
function. A jump function such that

$$x_1 < x_2 < \cdots < x_n < \cdots$$

is called a step function. For an example of a jump function which is not
a step function, see Problem 1.
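
The definition (2) can be tried out on a small case. The following is a minimal sketch, assuming three hypothetical jump points and heights chosen only for illustration; it uses exact rational arithmetic so that the one-sided behavior at a jump point can be read off without rounding.

```python
from fractions import Fraction

def jump_function(points, heights):
    """Return f with f(x) = sum of heights[n] over all n with points[n] < x."""
    def f(x):
        return sum(h for p, h in zip(points, heights) if p < x)
    return f

# Hypothetical data: jumps 1/2, 1/4, 1/8 at the points 1/4, 1/2, 3/4 of [0, 1].
points = [Fraction(1, 4), Fraction(1, 2), Fraction(3, 4)]
heights = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 8)]
f = jump_function(points, heights)

# Sample f on a grid and check that the values never decrease.
grid = [Fraction(k, 16) for k in range(17)]
values = [f(x) for x in grid]
is_nondecreasing = all(u <= v for u, v in zip(values, values[1:]))

# One-sided increments at the jump point x_2 = 1/2.
eps = Fraction(1, 1000)
left_gap = f(points[1]) - f(points[1] - eps)    # no change approaching from the left
right_gap = f(points[1] + eps) - f(points[1])   # the jump h_2 appears on the right
```

Since the three points here are listed in increasing order, this particular f is also a step function in the sense just defined.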


We now establish the basic properties of monotonic functions. To be 
explicit, we will talk about nondecreasing functions, but clearly everything 
carries over automatically to the case of nonincreasing functions. 


THEOREM 1. Every nondecreasing function f on [a, b] is measurable
and bounded, and hence summable.¹


Proof. Since $f(a) \le f(x) \le f(b)$ for all $x \in [a, b]$, f is obviously bounded.
Consider the set

$$E_c = \{x : f(x) < c\}.$$

If $E_c$ is empty, then $E_c$ is (trivially) measurable. If $E_c$ is nonempty, let
d be the least upper bound of all $x \in E_c$. Then $E_c$ is either the closed
interval [a, d], if $d \in E_c$, or the half-open interval [a, d) if $d \notin E_c$. In
either case, $E_c$ is measurable. ∎

THEOREM 2. Every discontinuity point of a nondecreasing function is 
of the first kind. 

Proof. Let $x_0$ be any point of [a, b], and let $\{x_n\}$ be any sequence
such that $x_n < x_0$, $x_n \to x_0$. Then $\{f(x_n)\}$ is a nondecreasing sequence
bounded from above, e.g., by the number $f(x_0)$. Therefore

$$\lim_{n \to \infty} f(x_n)$$

exists for any such sequence, i.e., $f(x_0 - 0)$ exists. The existence of
$f(x_0 + 0)$ is proved in the same way. ∎
Obviously, a nondecreasing function need not be continuous. However, 
we have 
THEOREM 3. A nondecreasing function can have no more than countably 
many points of discontinuity. 
Proof. The sum of the jumps of f on the interval [a, b] cannot exceed
$f(b) - f(a)$. Let $J_n$ be the set of all jumps greater than 1/n, and let J be
the set of all jumps regardless of size. Then obviously

$$J = \bigcup_{n=1}^{\infty} J_n,$$

where each $J_n$ is a finite set. Hence J has no more than countably many
elements. ∎
THEOREM 4. The jump function (2) is continuous from the left. Moreover,
all the discontinuity points of f are of the first kind, with the jump at $x_n$
equal to $h_n$.


1 See the corollary on p. 298. 
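Proof. Clearly,

$$f(x - 0) = \lim_{\substack{\varepsilon \to 0 \\ \varepsilon > 0}} f(x - \varepsilon) = \lim_{\substack{\varepsilon \to 0 \\ \varepsilon > 0}} \sum_{x_n < x - \varepsilon} h_n.$$

But if $x_n < x$, then $x_n < x - \varepsilon$ for sufficiently small $\varepsilon > 0$. Therefore

$$\lim_{\substack{\varepsilon \to 0 \\ \varepsilon > 0}} \sum_{x_n < x - \varepsilon} h_n = f(x),$$

and hence

$$f(x - 0) = f(x).$$

If x coincides with one of the points $x_n$, say with $x_{n_0}$, then

$$f(x_{n_0} + 0) = \lim_{\substack{\varepsilon \to 0 \\ \varepsilon > 0}} f(x_{n_0} + \varepsilon) = \lim_{\substack{\varepsilon \to 0 \\ \varepsilon > 0}} \sum_{x_n < x_{n_0} + \varepsilon} h_n = \sum_{x_n \le x_{n_0}} h_n,$$

which implies

$$f(x_{n_0} + 0) - f(x_{n_0} - 0) = h_{n_0}. \qquad ∎$$


THEOREM 5. If f is continuous from the left and nondecreasing, then
f is the sum of a continuous nondecreasing function $\varphi$ and a jump
function $\psi$.
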

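Proof. If $x_1, x_2, \ldots$ are the discontinuity points of f, with
corresponding jumps $h_1, h_2, \ldots$, let

$$\psi(x) = \sum_{x_n < x} h_n, \qquad \varphi(x) = f(x) - \psi(x).$$

Then

$$\varphi(x'') - \varphi(x') = [f(x'') - f(x')] - [\psi(x'') - \psi(x')],$$

where the expression on the right is the difference between the total
increment of f on the interval [x′, x″] and the sum of its jumps on
[x′, x″], i.e., $\varphi(x'') - \varphi(x')$ is the measure of the set of values taken by
f at its continuity points in [x′, x″]. This quantity is clearly nonnegative,
and hence $\varphi$ is nondecreasing. Moreover, given any point $x \in [a, b]$, we
have

$$\varphi(x - 0) = \lim_{\substack{\varepsilon \to 0 \\ \varepsilon > 0}} f(x - \varepsilon) - \lim_{\substack{\varepsilon \to 0 \\ \varepsilon > 0}} \psi(x - \varepsilon) = f(x - 0) - \sum_{x_n < x} h_n,$$

$$\varphi(x + 0) = \lim_{\substack{\varepsilon \to 0 \\ \varepsilon > 0}} f(x + \varepsilon) - \lim_{\substack{\varepsilon \to 0 \\ \varepsilon > 0}} \psi(x + \varepsilon) = f(x + 0) - \sum_{x_n \le x} h_n,$$

and hence

$$\varphi(x + 0) - \varphi(x - 0) = f(x + 0) - f(x - 0) - h = 0,$$

where h is the jump of $\psi$ at x. It follows that $\varphi$ is continuous at every
point $x \in [a, b]$. ∎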
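
The decomposition in Theorem 5 can be illustrated numerically. The sketch below assumes a hypothetical function on [0, 1] with a single jump of size 1 at x = 0.5 (not an example from the text) and checks on a grid that the part left over after subtracting the jump function increases by small positive steps only.

```python
def f(x):
    # Hypothetical example: nondecreasing and continuous from the left on [0, 1],
    # with a single jump of size 1 at x = 0.5.
    return x + (1.0 if x > 0.5 else 0.0)

def psi(x):
    # Jump part: sum of the jumps h_n over the discontinuity points x_n < x.
    return 1.0 if 0.5 < x else 0.0

def phi(x):
    # Remaining part, defined as in the proof by phi = f - psi.
    return f(x) - psi(x)

grid = [k / 100 for k in range(101)]
phi_values = [phi(x) for x in grid]
steps = [v - u for u, v in zip(phi_values, phi_values[1:])]
min_step = min(steps)   # positive: phi is increasing on the grid
max_step = max(steps)   # close to the grid spacing: no jump is left in phi
```

Here phi reduces to the identity function up to rounding, so every step is close to the grid spacing 0.01, whereas f itself has one step of size about 1.01.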
31.2. Differentiation of a monotonic function. The key result of this
section (see Theorem 6 below) will be to show that a monotonic function f
defined on an interval [a, b] has a finite derivative almost everywhere on [a, b].
Before proving this proposition, due to Lebesgue, we must first introduce
some further definitions and then establish three preliminary lemmas.

The derivative of a function f at a point $x_0$ is defined in the familiar way
as the limit of the ratio

$$\frac{f(x) - f(x_0)}{x - x_0} \tag{3}$$

as $x \to x_0$. Even if this limit fails to exist, the following four quantities
(which may take infinite values) always exist:


1) The lower limit of (3) as $x \to x_0$ from the left, denoted by $\lambda_L$;
2) The upper limit of (3) as $x \to x_0$ from the left, denoted by $\Lambda_L$;²
3) The lower limit of (3) as $x \to x_0$ from the right, denoted by $\lambda_R$;
4) The upper limit of (3) as $x \to x_0$ from the right, denoted by $\Lambda_R$.


These four quantities, with the geometric meaning shown in Figure 17, are
called the derived numbers of f at $x_0$.³ It is clear that the inequalities

$$\lambda_L \le \Lambda_L, \qquad \lambda_R \le \Lambda_R \tag{4}$$

always hold. If $\lambda_L$ and $\Lambda_L$ exist and are equal, their common value is just
the left-hand derivative of f at $x_0$. Similarly, if $\lambda_R$ and $\Lambda_R$ exist and are
equal, their common value is just the right-hand derivative of f at $x_0$. Moreover,
f has a derivative at $x_0$ if and only if all four derived numbers $\lambda_L$, $\Lambda_L$,
$\lambda_R$ and $\Lambda_R$ exist and are equal at $x_0$.


FIGURE 17


2 Upper and lower limits are defined on p. 111.
3 To distinguish these quantities further, we can call $\lambda_L$ the left-hand lower derived number,
$\Lambda_R$ the right-hand upper derived number, and so on.


Hence the italicized assertion at the
beginning of this section can be restated as follows: For a monotonic function
defined on an interval [a, b], the formula

$$-\infty < \lambda_L = \Lambda_L = \lambda_R = \Lambda_R < \infty$$

holds almost everywhere on [a, b].
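
The four derived numbers can be estimated by sampling the ratio (3) along a sequence of step sizes shrinking to zero. This is only a rough sketch: the function name `derived_numbers` and the choice of step sizes are assumptions made here, and the minimum and maximum over finitely many samples merely approximate the lower and upper limits.

```python
def derived_numbers(f, x0, steps=None):
    """Estimate (lambda_L, Lambda_L, lambda_R, Lambda_R) of f at x0 by sampling
    the difference quotient on each side of x0."""
    if steps is None:
        steps = [2.0 ** -k for k in range(10, 40)]
    right = [(f(x0 + h) - f(x0)) / h for h in steps]
    left = [(f(x0 - h) - f(x0)) / (-h) for h in steps]
    return min(left), max(left), min(right), max(right)

# For f(x) = |x| at x0 = 0 the two left-hand numbers agree and the two
# right-hand numbers agree, but the left and right values differ.
lam_L, Lam_L, lam_R, Lam_R = derived_numbers(abs, 0.0)
```

In this example the one-sided derivatives both exist, but since the left-hand and right-hand values differ, |x| has no derivative at 0, in agreement with the criterion stated above.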


DEFINITION 5. Let f be a continuous function defined on an interval
[a, b]. Then a point $x_0 \in [a, b]$ is said to be invisible from the right (with
respect to f) if there is a point $\xi$ such that $x_0 < \xi \le b$ and $f(x_0) < f(\xi)$,
and invisible from the left if there is a point $\xi$ such that $a \le \xi < x_0$ and

$$f(x_0) < f(\xi).$$


Example. In Figure 18, the points belonging to the intervals $[a_1, b_1)$ and
$(a_2, b_2)$ are invisible from the right (interpret the word "invisible").


LEMMA 1 (F. Riesz). The set of all points invisible from the right with
respect to a function f continuous on [a, b] is the union of no more than
countably many pairwise disjoint open intervals $(a_k, b_k)$,⁴ such that

$$f(a_k) \le f(b_k) \qquad (k = 1, 2, \ldots). \tag{5}$$


Proof. If $x_0$ is invisible from the right with respect to f, then the
same is true of any point sufficiently close to $x_0$, by the continuity of f.
Hence the set of all points invisible from the right is an open set G. It
follows from Theorem 6, p. 51 that G is the union of a finite or countable
system of pairwise disjoint open intervals. Let $(a_k, b_k)$ be one of these
intervals, and suppose

$$f(a_k) > f(b_k). \tag{6}$$


FIGURE 18


4 However, if $a_k = a$ (say), then in some cases $(a_k, b_k)$ should be replaced by the
half-open interval $[a_k, b_k)$, as in Figure 18. This is permissible, since $[a_k, b_k)$ is open relative to
[a, b].


Then there is an (interior) point $x_0 \in (a_k, b_k)$ such that $f(x_0) > f(b_k)$.
Of the points $x \in (a_k, b_k)$ such that $f(x) = f(x_0)$, let $x^*$ be the one with
largest abscissa ($x^*$ may coincide with $x_0$). Since $x^*$ belongs to $(a_k, b_k)$
and hence is invisible from the right, there is a point $\xi > x^*$ such that
$f(\xi) > f(x^*)$. Clearly $\xi$ cannot belong to $(a_k, b_k)$, since $x^*$ is the point
x with largest abscissa for which $f(x) = f(x_0)$, while $f(b_k) < f(x_0)$, so
that $\xi \in (a_k, b_k)$ would imply the existence of a point $x > x^*$ such that
$f(x) = f(x_0)$. On the other hand, the inequality $\xi > b_k$ is also
impossible, since it would imply $f(b_k) < f(x_0) < f(\xi)$ despite the fact that
$b_k$ is not invisible from the right. Thus (6) leads to a contradiction
(obviously $\xi \ne b_k$). It follows that $f(a_k) \le f(b_k)$. ∎
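
A discrete analogue of Lemma 1 can be checked directly. In the sketch below, which is an illustration on a finite list of sample values rather than the continuous statement itself, an index i counts as "invisible from the right" if some later sample is strictly larger; the helper name `invisible_runs` and the sample data are hypothetical.

```python
def invisible_runs(values):
    """Group the indices i with values[i] < max(values[i+1:]) into maximal runs
    of consecutive indices, returned as (first, last) pairs."""
    n = len(values)
    suffix_max = [float("-inf")] * n   # suffix_max[i] = max(values[i+1:])
    for i in range(n - 2, -1, -1):
        suffix_max[i] = max(values[i + 1], suffix_max[i + 1])
    invisible = [values[i] < suffix_max[i] for i in range(n)]
    runs, start = [], None
    for i, flag in enumerate(invisible):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            runs.append((start, i - 1))
            start = None
    return runs

# Hypothetical samples of a function on an evenly spaced grid.
values = [3, 1, 2, 5, 4, 0, 1, 3, 2]
runs = invisible_runs(values)

# Discrete counterpart of (5): every sample inside a run lies below the sample
# at the first visible index to the right of the run.
dominated = all(values[i] < values[last + 1]
                for first, last in runs
                for i in range(first, last + 1))
```

The last index is never invisible, so every run has a visible right-hand neighbor, which plays the role of the end point $b_k$.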


LEMMA 1′. The set of all points invisible from the left with respect to
a function f continuous on [a, b] is the union of no more than countably
many pairwise disjoint open intervals $(a_k, b_k)$, such that

$$f(a_k) \ge f(b_k) \qquad (k = 1, 2, \ldots).$$

Proof. Virtually the same as that of Lemma 1. ∎

LEMMA 2. Let f be a continuous nondecreasing function on [a, b], with
$\lambda_L$ and $\Lambda_R$ as two of its derived numbers. Given any numbers c, C and $\rho$
such that

$$0 < c < C < \infty, \qquad \rho = \frac{c}{C},$$

let $E_\rho$ be the set

$$E_\rho = \{x : \lambda_L < c,\ \Lambda_R > C\}.$$

Then

$$\mu\{x : x \in E_\rho \cap (\alpha, \beta)\} \le \rho(\beta - \alpha)$$

for every open interval $(\alpha, \beta) \subset [a, b]$.


Proof. Let $x_0$ be a point of $(\alpha, \beta)$ for which $\lambda_L < c$. Then there is a
point $\xi < x_0$ such that

$$\frac{f(\xi) - f(x_0)}{\xi - x_0} < c,$$

i.e., such that

$$f(\xi) - c\xi > f(x_0) - cx_0.$$

Therefore $x_0$ is invisible from the left with respect to the function
f(x) − cx. Hence, by Lemma 1′, the set of all such $x_0$ is the union of
no more than countably many pairwise disjoint open intervals $(\alpha_k, \beta_k) \subset (\alpha, \beta)$,
where

$$f(\alpha_k) - c\alpha_k \ge f(\beta_k) - c\beta_k,$$

or equivalently

$$f(\beta_k) - f(\alpha_k) \le c(\beta_k - \alpha_k). \tag{7}$$

Let $G_k$ be the set of points in $(\alpha_k, \beta_k)$ for which $\Lambda_R > C$. Then, by
virtually the same argument together with Lemma 1, $G_k$ is the union of
no more than countably many pairwise disjoint open intervals $(\alpha_{k_n}, \beta_{k_n})$,
where

$$\beta_{k_n} - \alpha_{k_n} \le \frac{1}{C} [f(\beta_{k_n}) - f(\alpha_{k_n})] \tag{8}$$

(why?). Clearly $E_\rho \cap (\alpha, \beta)$ is covered by the system of intervals $(\alpha_{k_n}, \beta_{k_n})$.
Moreover, it follows from (7) and (8) that

$$\sum_{k,n} (\beta_{k_n} - \alpha_{k_n}) \le \frac{1}{C} \sum_{k,n} [f(\beta_{k_n}) - f(\alpha_{k_n})] \le \frac{1}{C} \sum_k [f(\beta_k) - f(\alpha_k)] \le \frac{c}{C} \sum_k (\beta_k - \alpha_k) \le \rho(\beta - \alpha). \qquad ∎$$

We are now in a position to prove 


THEOREM 6 (Lebesgue). A monotonic function f defined on an interval 
[a, b] has a finite derivative almost everywhere on [a, b]. 


Proof. There is no loss of generality in assuming that f is nondecreasing,
since if f is nonincreasing, then obviously −f is nondecreasing.
But if −f has a derivative almost everywhere, then so does f. We
also assume that f is continuous, dropping this restriction at the end of
the proof. It will be enough to show that the two inequalities

$$\Lambda_R < +\infty \tag{9}$$

and

$$\lambda_L \ge \Lambda_R \tag{10}$$

hold almost everywhere on [a, b], for any continuous nondecreasing
function. In fact, setting $f^*(x) = -f(-x)$, we see that $f^*$ is continuous
and nondecreasing, like f itself. Moreover, it is easily verified that

$$\lambda_L^* = \lambda_R, \qquad \Lambda_R^* = \Lambda_L,$$

where $\lambda_L^*$ and $\Lambda_R^*$ are the indicated derived numbers of $f^*$. Therefore,
applying (10) to $f^*$, we get

$$\lambda_L^* \ge \Lambda_R^*$$

or

$$\lambda_R \ge \Lambda_L. \tag{11}$$

Combining the inequalities (10) and (11), we obtain

$$\Lambda_R \le \lambda_L \le \Lambda_L \le \lambda_R \le \Lambda_R,$$

after using (4). Thus if (9) and (10) hold almost everywhere, we have⁵

$$-\infty < \lambda_L = \Lambda_L = \lambda_R = \Lambda_R < +\infty$$

almost everywhere, and the theorem is proved.

To prove that $\Lambda_R < +\infty$ almost everywhere, we argue as follows:
If $\Lambda_R = +\infty$ at some point $x_0$, then, given any constant C > 0, there is
a point $\xi > x_0$ such that

$$\frac{f(\xi) - f(x_0)}{\xi - x_0} > C,$$

i.e.,

$$f(\xi) - f(x_0) > C(\xi - x_0),$$

or equivalently

$$f(\xi) - C\xi > f(x_0) - Cx_0.$$

Thus $x_0$ is invisible from the right with respect to the function f(x) − Cx.
Hence, by Lemma 1, the set of all points $x_0$ at which $\Lambda_R = +\infty$ is the
union of no more than countably many open intervals $(a_k, b_k)$, whose
end points satisfy the inequalities

$$f(a_k) - Ca_k \le f(b_k) - Cb_k,$$

or

$$f(b_k) - f(a_k) \ge C(b_k - a_k).$$

Dividing by C and summing over all the intervals $(a_k, b_k)$, we get

$$\sum_k (b_k - a_k) \le \frac{1}{C} \sum_k [f(b_k) - f(a_k)] \le \frac{f(b) - f(a)}{C}.$$

But C can be made arbitrarily large. Hence the set of points where $\Lambda_R = +\infty$
can be covered by a collection of intervals the sum of whose lengths
is arbitrarily small. It follows that this set is of measure zero, i.e., that
$\Lambda_R < +\infty$ almost everywhere.

To prove that $\lambda_L \ge \Lambda_R$ almost everywhere, let the numbers c, C,
$\rho$ and the set $E_\rho$ be the same as in Lemma 2. It will then follow that
$\lambda_L \ge \Lambda_R$ almost everywhere if we succeed in showing that $\mu(E_\rho) = 0$,
since the set of points where $\lambda_L < \Lambda_R$ can clearly be represented as the
union of no more than countably many sets of the form $E_\rho$ (why?).
Let $\mu(E_\rho) = t$. Then, given any $\varepsilon > 0$, there is an open set G, equal
to the union of no more than countably many open intervals $(a_k, b_k)$,
such that $E_\rho \subset G$ and

$$\sum_k (b_k - a_k) < t + \varepsilon$$

(this follows from the very definition of Lebesgue measure on the line).
If

$$t_k = \mu[E_\rho \cap (a_k, b_k)],$$

then

$$t = \sum_k t_k.$$

But $t_k \le \rho(b_k - a_k)$, by Lemma 2. Hence

$$t \le \rho \sum_k (b_k - a_k) < \rho(t + \varepsilon),$$

which implies $t \le \rho t$, since $\varepsilon > 0$ is arbitrary. This in turn implies
t = 0, since $0 < \rho < 1$. Therefore $\lambda_L \ge \Lambda_R$ almost everywhere, as
asserted.

Finally, to drop the requirement that f be continuous, we need only
generalize Lemmas 1 and 1′ in the way indicated in Problem 6, noting that
the proof continues to go through (check details).⁶ ∎


5 Note that $\Lambda_R$ cannot equal $-\infty$, since the difference quotient (3) is inherently
nonnegative if f is nondecreasing.


Remark. Despite its apparent complexity, the proof of Theorem 6 is
based on simple intuitive ideas. For example, the finiteness of $\Lambda_R$ (and $\Lambda_L$)
almost everywhere is easily made plausible. In fact, let f be continuous and
nondecreasing on [a, b]. Then f maps [a, b] into the interval [f(a), f(b)], at
the same time subjecting a small interval $[x, \xi]$ at x to a "magnification"
approximately equal to

$$\gamma(x) = \frac{f(\xi) - f(x)}{\xi - x}.$$

But the interval [f(a), f(b)] is finite, and hence $\gamma(x)$ cannot be infinite on a
set of positive measure. As for the part of the proof based on Lemma 2,
it merely says that if the intersection of a subset $A \subset [a, b]$ with every interval
$(\alpha, \beta)$ has measure no greater than $\rho(\beta - \alpha)$ for some fixed number $\rho < 1$,
then A cannot have positive measure.


31.3. Differentiation of an integral with respect to its upper limit. Returning
to the problem of differentiating the indefinite Lebesgue integral, we have 


THEOREM 7. Let f be any function summable on [a, b]. Then 


$$\frac{d}{dx} \int_a^x f(t)\,dt \tag{12}$$


exists and is finite for almost all x.

Proof. As noted at the beginning of Sec. 31.1,

$$f(t) = f_+(t) - f_-(t),$$

where $f_+$ and $f_-$ are nonnegative summable functions, so that

$$F(x) = \int_a^x f(t)\,dt = \int_a^x f_+(t)\,dt - \int_a^x f_-(t)\,dt = F_+(x) - F_-(x)$$

is the difference between two nondecreasing functions $F_+$ and $F_-$. But $F_+$
and $F_-$ have finite derivatives almost everywhere, by Theorem 6, and
hence so does F. ∎


6 For an alternative proof, see Problems 7-9.


We now evaluate the derivative (12), thereby giving an affirmative answer 
to the first of the two questions posed on p. 314: 


THEOREM 8. Let f be any function summable on [a, b]. Then 


$$\frac{d}{dx} \int_a^x f(t)\,dt = f(x)$$

almost everywhere.

Proof. Let

$$F(x) = \int_a^x f(t)\,dt.$$

Then it will be enough to show that

$$f(x) \ge F'(x) \tag{13}$$

almost everywhere for any summable function. In fact, changing f(x)
to −f(x) in (13), we get

$$-f(x) \ge -F'(x)$$

and hence

$$f(x) \le F'(x). \tag{14}$$

But (13) and (14) together imply the desired result

$$f(x) = F'(x) = \frac{d}{dx} \int_a^x f(t)\,dt$$

(almost everywhere).
To prove (13), we observe that if

$$f(x) < F'(x),$$

then there are rational numbers $\alpha$ and $\beta$ such that

$$f(x) < \alpha < \beta < F'(x). \tag{15}$$

Let $E_{\alpha\beta}$ be the set of all x satisfying (15). Then, as we now show,
$\mu(E_{\alpha\beta}) = 0$. Since the number of sets $E_{\alpha\beta}$ is countable, this will imply

$$\mu\{x : f(x) < F'(x)\} = 0$$

and hence that (13) holds almost everywhere.




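To prove that $\mu(E_{\alpha\beta}) = 0$, we first note that, given any $\varepsilon > 0$, there is
a $\delta > 0$ such that $\mu(E) < \delta$ implies

$$\left| \int_E f(t)\,dt \right| < \varepsilon$$

(the existence of such a number $\delta$ follows from the absolute continuity
of the Lebesgue integral, proved in Theorem 6, p. 300).⁷ Let $G \subset [a, b]$
be an open set, made up of no more than countably many pairwise
disjoint open intervals $(a_k, b_k)$, such that

$$E_{\alpha\beta} \subset G, \qquad \mu(G) < \mu(E_{\alpha\beta}) + \delta,$$

and let $x_0$ be any point in $G_k = E_{\alpha\beta} \cap (a_k, b_k)$. Then

$$\frac{F(\xi) - F(x_0)}{\xi - x_0} > \beta \tag{16}$$

for any point $\xi > x_0$ sufficiently close to $x_0$. Writing (16) in the form

$$F(\xi) - \beta\xi > F(x_0) - \beta x_0,$$

we see that the point $x_0$ is invisible from the right with respect to the
continuous function $F(x) - \beta x$. It follows from Lemma 1 that $G_k$ is
the union of no more than countably many pairwise disjoint open
intervals $(a_{k_n}, b_{k_n})$, where

$$F(a_{k_n}) - \beta a_{k_n} \le F(b_{k_n}) - \beta b_{k_n},$$

i.e.,

$$F(b_{k_n}) - F(a_{k_n}) \ge \beta(b_{k_n} - a_{k_n}),$$

or equivalently

$$\int_{a_{k_n}}^{b_{k_n}} f(t)\,dt \ge \beta(b_{k_n} - a_{k_n}). \tag{17}$$

If

$$S = \bigcup_{k,n} (a_{k_n}, b_{k_n}),$$

then clearly

$$E_{\alpha\beta} \subset S \subset G, \qquad \mu(S) < \mu(E_{\alpha\beta}) + \delta.$$

Summing (17) over all the intervals $(a_{k_n}, b_{k_n})$, we get

$$\int_S f(t)\,dt = \sum_{k,n} \int_{a_{k_n}}^{b_{k_n}} f(t)\,dt \ge \beta \sum_{k,n} (b_{k_n} - a_{k_n}) = \beta\mu(S).$$


7 In particular, F(x) is continuous. In fact,

$$|F(x') - F(x)| = \left| \int_x^{x'} f(t)\,dt \right| < \varepsilon$$

if $|x' - x| < \delta$.


On the other hand,

$$\int_S f(t)\,dt = \int_{E_{\alpha\beta}} f(t)\,dt + \int_{S - E_{\alpha\beta}} f(t)\,dt \le \alpha\mu(E_{\alpha\beta}) + \varepsilon \le \alpha\mu(S) + |\alpha|\delta + \varepsilon. \tag{18}$$

Comparing (17) and (18), we get

$$\alpha\mu(S) + |\alpha|\delta + \varepsilon \ge \beta\mu(S)$$

or

$$\mu(S) \le \frac{|\alpha|\delta + \varepsilon}{\beta - \alpha}.$$

Therefore $E_{\alpha\beta}$ is contained in an open set of arbitrarily small measure (it
can be assumed that $|\alpha|\delta < \varepsilon$). It follows that $\mu(E_{\alpha\beta}) = 0$. ∎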

Problem 1. Let $x_1, x_2, \ldots, x_n, \ldots$ be the set of all rational points in
[a, b], enumerated in any way, and let $h_n = 1/2^n$. Prove that the jump
function

$$f(x) = \sum_{x_n < x} h_n$$

is discontinuous at every rational point and continuous at every irrational
point.

Problem 2. Suppose we define a jump function by the formula

$$f(x) = \sum_{x_n \le x} h_n \tag{19}$$

rather than by the formula (2). Prove that f is continuous from the right,
rather than from the left as in Theorem 4.


Problem 3. Find the derived numbers of the function

$$f(x) = \begin{cases} x \sin \dfrac{1}{x} & \text{if } x > 0, \\ 0 & \text{if } x \le 0 \end{cases}$$

at the point x = 0.


Problem 4. Find the points invisible from the left in Figure 18, p. 319.

Problem 5. In Lemma 1, show that $f(a_k) = f(b_k)$ if $a_k \ne a$.


Problem 6. Prove that the requirement that f be continuous on [a, b] can
be dropped in Lemma 1, provided that


1) The discontinuity points of f are all of the first kind;
2) A point $x_0 \in [a, b]$ is said to be invisible from the right (with respect
to f) if there is a point $\xi$ such that $x_0 < \xi \le b$ and

$$\max\{f(x_0 - 0), f(x_0), f(x_0 + 0)\} < f(\xi);$$

3) The inequality (5) is replaced by

$$f(a_k + 0) \le \max\{f(b_k - 0), f(b_k), f(b_k + 0)\}.$$


State and prove the corresponding generalization of Lemma 1′.


Problem 7. Let

$$\sum_{n=1}^{\infty} \varphi_n(x) = f(x) \tag{20}$$

be an everywhere convergent series, whose general term $\varphi_n(x)$ is nondecreasing
(alternatively, nonincreasing) on [a, b]. Prove that (20) can be differentiated
term by term almost everywhere, i.e., that

$$\sum_{n=1}^{\infty} \varphi_n'(x) = f'(x)$$

almost everywhere.

Problem 8. Prove that every jump function has a zero derivative almost
everywhere.

Hint. Use Problem 7.


Problem 9. Prove that the assumption that f be continuous from the left
in Theorem 5 can be dropped if we define a jump function as a sum of a
"left jump function" like (2) and a "right jump function" like (19). Use
this fact and Problem 8 to complete the proof of Theorem 6 without recourse
to Problem 6.

Hint. Use Problem 8 and Theorem 5.


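Problem 10. Following van der Waerden, let

$$\varphi_0(x) = \begin{cases} x & \text{if } 0 \le x \le \tfrac{1}{2}, \\ 1 - x & \text{if } \tfrac{1}{2} < x \le 1, \end{cases}$$

and continue $\varphi_0$ by periodicity, with period 1, over the whole x-axis. Then
let

$$\varphi_n(x) = \frac{1}{4^n} \varphi_0(4^n x) \quad (n = 1, 2, \ldots), \qquad f(x) = \sum_{n=0}^{\infty} \varphi_n(x).$$

Prove that
a) The function f is continuous everywhere;
b) The derivative of f fails to exist at every point $x_0 \in (-\infty, \infty)$.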


Hint. Consider the increments

$$f\left(x_0 \pm \frac{1}{4^n}\right) - f(x_0).$$
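
The behavior behind part b) can be explored with exact arithmetic. The sketch below is an illustration under stated assumptions, not a proof: it truncates the series to a hypothetical 12 terms, takes the step h = ±1/4^n with the sign chosen so that $x_0$ and $x_0 + h$ fall in the same grid interval of length 2/4^n, and uses the sample point $x_0 = 1/3$.

```python
from fractions import Fraction

def phi0(x):
    """Distance from x to the nearest integer (period 1, equal to x on [0, 1/2])."""
    frac = x - (x.numerator // x.denominator)
    return min(frac, 1 - frac)

def partial_sum(x, terms):
    """Sum of phi_n(x) = phi0(4**n * x) / 4**n for n = 0, ..., terms - 1."""
    return sum(phi0(4 ** n * x) / 4 ** n for n in range(terms))

def difference_quotient(x0, n, terms=12):
    """Quotient of the partial sum with step h = +-4**-n, the sign chosen so that
    x0 and x0 + h lie in the same interval of the grid with spacing 2 * 4**-n."""
    h = Fraction(1, 4 ** n)
    cell = 2 * h
    if (x0 + h) // cell != x0 // cell:
        h = -h
    return (partial_sum(x0 + h, terms) - partial_sum(x0, terms)) / h

x0 = Fraction(1, 3)
quotients = [difference_quotient(x0, n) for n in range(1, 9)]
```

For this choice of step each quotient comes out as an integer with the same parity as n, so consecutive quotients differ by at least 1 and cannot approach a finite limit.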
32. Functions of Bounded Variation


The problem of differentiating a Lebesgue integral with respect to its 
upper limit has led us to consider functions that can be represented as 
differences between two monotonic functions. We now give a different 
description of such functions (independent of the notion of monotonicity), 
afterwards studying some of their properties. 


DEFINITION 1. A function f defined on an interval [a, b] is said to be
of bounded variation if there is a constant C > 0 such that

$$\sum_{k=1}^{n} |f(x_k) - f(x_{k-1})| \le C \tag{1}$$

for every partition

$$a = x_0 < x_1 < \cdots < x_n = b \tag{2}$$

of [a, b] by points of subdivision $x_0, x_1, \ldots, x_n$.


Example. Every monotonic function is of bounded variation, since the
left-hand side of (1) equals $|f(b) - f(a)|$ regardless of the choice of partition.


DEFINITION 2. Let f be a function of bounded variation. Then by the
total variation of f on [a, b], denoted by $V_a^b(f)$, is meant the quantity

$$V_a^b(f) = \sup \sum_{k=1}^{n} |f(x_k) - f(x_{k-1})|, \tag{3}$$

where the least upper bound is taken over all (finite) partitions (2) of the
interval [a, b].
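
The sums appearing in (1) and (3) are easy to compute for a given partition. The sketch below uses hypothetical helper names (`variation_sum`, `uniform_partition`) and uniform partitions only, so for a non-monotonic function it yields lower bounds for the total variation rather than the supremum itself.

```python
import math

def variation_sum(f, partition):
    """Sum of |f(x_k) - f(x_{k-1})| over consecutive points of the partition."""
    return sum(abs(f(b) - f(a)) for a, b in zip(partition, partition[1:]))

def uniform_partition(a, b, n):
    return [a + (b - a) * k / n for k in range(n + 1)]

# Monotonic case: every partition gives |f(b) - f(a)|, here e - 1 for exp on [0, 1].
mono = variation_sum(math.exp, uniform_partition(0.0, 1.0, 50))

# Non-monotonic case: a coarse partition of [0, 2*pi] underestimates the
# variation of sin, while a refinement of it gets close to the value 4.
coarse = variation_sum(math.sin, uniform_partition(0.0, 2 * math.pi, 3))
fine = variation_sum(math.sin, uniform_partition(0.0, 2 * math.pi, 300))
```

The 300-point partition refines the 3-point one, which is why the second sum cannot be smaller than the first.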


Remark 1. A function f defined on the whole real line $(-\infty, \infty)$ is said
to be of bounded variation if there is a constant C > 0 such that

$$V_a^b(f) \le C$$

for every pair of real numbers a and b (a < b). The quantity

$$\lim_{\substack{a \to -\infty \\ b \to \infty}} V_a^b(f)$$

is then called the total variation of f on $(-\infty, \infty)$, denoted by $V_{-\infty}^{\infty}(f)$.


Remark 2. It is an immediate consequence of (3) that

$$V_a^b(\alpha f) = |\alpha|\, V_a^b(f) \tag{4}$$

for any constant $\alpha$.

THEOREM 1. If f and g are functions of bounded variation on [a, b],
then so is f + g and

$$V_a^b(f + g) \le V_a^b(f) + V_a^b(g). \tag{5}$$
Proof. For any partition of the interval [a, b], we have

$$\sum_k |f(x_k) + g(x_k) - f(x_{k-1}) - g(x_{k-1})| \le \sum_k |f(x_k) - f(x_{k-1})| + \sum_k |g(x_k) - g(x_{k-1})|.$$

Taking the least upper bound of both sides over all partitions of
[a, b], and noting that

$$\sup\{x + y : x \in A,\ y \in B\} \le \sup\{x : x \in A\} + \sup\{y : y \in B\},$$

we immediately get (5). ∎


It follows from (4) and (5) that any linear combination of functions of
bounded variation is itself a function of bounded variation. In other words,
the set of all functions of bounded variation on a given interval is a linear
space (unlike the set of all monotonic functions).


THEOREM 2. If a < b < c, then

$$V_a^c(f) = V_a^b(f) + V_b^c(f). \tag{6}$$


Proof. First we consider a partition of the interval [a, c] such that
b is one of the points of subdivision, say $x_r = b$. Then

$$\sum_{k=1}^{n} |f(x_k) - f(x_{k-1})| = \sum_{k=1}^{r} |f(x_k) - f(x_{k-1})| + \sum_{k=r+1}^{n} |f(x_k) - f(x_{k-1})| \le V_a^b(f) + V_b^c(f). \tag{7}$$

Now consider an arbitrary partition of [a, c]. It is clear that adding an
extra point of subdivision to this partition can never decrease the sum

$$\sum_{k=1}^{n} |f(x_k) - f(x_{k-1})|.$$

Therefore (7) holds for any subdivision of [a, c], and hence

$$V_a^c(f) \le V_a^b(f) + V_b^c(f). \tag{8}$$

On the other hand, given any $\varepsilon > 0$, there are partitions of the intervals
[a, b] and [b, c], respectively, such that

$$\sum_i |f(x_i') - f(x_{i-1}')| > V_a^b(f) - \frac{\varepsilon}{2},$$

$$\sum_j |f(x_j'') - f(x_{j-1}'')| > V_b^c(f) - \frac{\varepsilon}{2}.$$

Combining all points of subdivision $x_i'$, $x_j''$, we get a partition of the
interval [a, c], with points of subdivision $x_k$, such that

$$\sum_k |f(x_k) - f(x_{k-1})| = \sum_i |f(x_i') - f(x_{i-1}')| + \sum_j |f(x_j'') - f(x_{j-1}'')| > V_a^b(f) + V_b^c(f) - \varepsilon.$$

Since $\varepsilon > 0$ is arbitrary, it follows that

$$V_a^c(f) \ge V_a^b(f) + V_b^c(f). \tag{9}$$

Comparing (8) and (9), we get (6). ∎


COROLLARY. The function

$$v(x) = V_a^x(f) \tag{10}$$

is nondecreasing.


Proof. An immediate consequence of (6), since the total variation of
any function of bounded variation on any interval is nonnegative. ∎


THEOREM 3. Let f be a function of bounded variation on [a, b], and let
v be the function (10). Then if f is continuous from the left at a point $x^*$,
so is v.


Proof. Given any $\varepsilon > 0$, use the fact that f is continuous from the left
to choose a $\delta > 0$ such that

$$|f(x^*) - f(x)| < \frac{\varepsilon}{2} \tag{11}$$

whenever $x^* - x < \delta$. Then choose a partition

$$a = x_0 < x_1 < \cdots < x_n = x^*$$

such that

$$V_a^{x^*}(f) - \sum_{k=1}^{n} |f(x_k) - f(x_{k-1})| < \frac{\varepsilon}{2}. \tag{12}$$

Here it can be assumed that

$$x^* - x_{n-1} < \delta,$$

since otherwise we need only add an extra point of subdivision which can
never increase the left-hand side of (12). It follows from (11) and (12)
that

$$V_a^{x^*}(f) - \sum_{k=1}^{n-1} |f(x_k) - f(x_{k-1})| < \varepsilon,$$

and hence

$$V_a^{x^*}(f) - V_a^{x_{n-1}}(f) < \varepsilon$$

a fortiori, i.e.,

$$v(x^*) - v(x_{n-1}) < \varepsilon.$$

But then, since v is nondecreasing,

$$v(x^*) - v(x) < \varepsilon$$

for all x such that $x_{n-1} \le x \le x^*$. In other words, v is continuous from
the left at $x^*$. ∎

Remark. Virtually the same argument shows that if f is continuous from
the right at $x^*$, then so is v. Together with Theorem 3, this shows that if
f is continuous at $x^*$, or on the whole interval [a, b], then so is v.


THEOREM 4. If f is of bounded variation on [a, b], then f can be
represented as the difference between two nondecreasing functions on [a, b].


Proof. Let

$$v(x) = V_a^x(f),$$

and consider the function

$$g = v - f.$$

Then g is nondecreasing. In fact, if x′ < x″, then

$$g(x'') - g(x') = [v(x'') - v(x')] - [f(x'') - f(x')]. \tag{13}$$

But

$$|f(x'') - f(x')| \le v(x'') - v(x'),$$

by the very definition of v, and hence the right-hand side of (13) is
nonnegative. Writing

$$f = v - g,$$

we get the desired representation of f as the difference between two
nondecreasing functions. ∎
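
The construction in this proof can be imitated on a grid. In the sketch below, the cumulative sums over grid points stand in for $v(x) = V_a^x(f)$; this is an assumption of the illustration, since sums over one fixed partition only approximate the supremum from below. The function name `jordan_decomposition` is likewise just a label chosen here.

```python
import math

def jordan_decomposition(f, grid):
    """Approximate v(x) = V_a^x(f) on the grid by cumulative sums of
    |f(x_k) - f(x_{k-1})|, and return v together with g = v - f."""
    values = [f(x) for x in grid]
    v = [0.0]
    for prev, cur in zip(values, values[1:]):
        v.append(v[-1] + abs(cur - prev))
    g = [vi - fi for vi, fi in zip(v, values)]
    return v, g

grid = [2 * math.pi * k / 200 for k in range(201)]
v, g = jordan_decomposition(math.sin, grid)

# Both pieces should be nondecreasing along the grid (up to rounding for g).
v_monotone = all(a <= b for a, b in zip(v, v[1:]))
g_monotone = all(a <= b + 1e-12 for a, b in zip(g, g[1:]))
```

Each step of g equals $|\Delta f| - \Delta f$, which is the grid version of the inequality used after (13).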


COROLLARY 1. Every function of bounded variation has a finite derivative
almost everywhere.


Proof. An immediate consequence of Theorem 6, p. 321. ∎


COROLLARY 2. If f is summable on [a, b], then the indefinite integral

$$\Phi(x) = \int_a^x f(t)\,dt$$

is a function of bounded variation on [a, b].

Proof. Recall the remarks at the beginning of Sec. 31.1. ∎

Problem 1. Prove that $V_a^b(f) = 0$ if and only if f(x) ≡ const on [a, b].

Problem 2. Prove that the function

$$f(x) = \begin{cases} x^{\alpha} \sin \dfrac{1}{x^{\beta}} & \text{if } 0 < x \le 1, \\ 0 & \text{if } x = 0 \end{cases}$$

is of bounded variation on [0, 1] if $\alpha > \beta$ but not if $\alpha \le \beta$.




Problem 3. Suppose f has a bounded derivative on [a, b], so that f′(x)
exists and satisfies an inequality $|f'(x)| \le C$ at every point $x \in [a, b]$. Prove
that f is of bounded variation and

$$V_a^b(f) \le C(b - a).$$


Problem 4. Prove that if f and g are functions of bounded variation on
[a, b], then so is fg and

$$V_a^b(fg) \le V_a^b(f) \sup |g(x)| + V_a^b(g) \sup |f(x)|.$$


Problem 5. Let f be a function of bounded variation on [a, b] such that

$$f(x) \ge c > 0.$$

Prove that 1/f is also a function of bounded variation and

$$V_a^b\left(\frac{1}{f}\right) \le \frac{1}{c^2}\, V_a^b(f).$$


Problem 6. Prove the converse of Theorem 4. 


Problem 7. Prove that a curve

$$y = f(x) \qquad (a \le x \le b)$$

is rectifiable, i.e., has finite length, as defined in Problem 3, p. 114, if and
only if f is of bounded variation on [a, b].


Problem 8. Let f be a function of bounded variation on [a, b]. Prove that

$$\|f\| = V_a^b(f)$$

has all the properties of a norm (cf. p. 138) if we impose the extra condition
f(a) = 0.


Comment. Thus the space $V^0_{[a,b]}$ of all functions of bounded variation
on [a, b] equipped with this norm and vanishing at x = a is a normed
linear space (addition of functions and multiplication of functions by
numbers being defined in the usual way).


Problem 9. Prove that the space $V^0_{[a,b]}$ defined in the preceding comment
is complete.


Problem 10. Does there exist a continuous function which is not of 
bounded variation on any interval? 


Hint. Recall Problem 10, p. 327 and Corollary 1 above. 
33. Reconstruction of a Function from Its Derivative


33.1. Statement of the problem. We now address ourselves to the second
of the problems posed on p. 314, i.e., we look for the largest class of functions
F such that

$$\int_a^x F'(t)\,dt = F(x) - F(a), \tag{1}$$

or equivalently

$$F(x) = F(a) + \int_a^x F'(t)\,dt. \tag{2}$$


(As we know from calculus, these formulas hold if F is continuously differ- 
entiable.) From the outset, we must restrict ourselves to functions F which 
are differentiable (i.e., have a finite derivative) almost everywhere, since 
otherwise (2) would be meaningless. Every function of bounded variation 
has this property (see Corollary 1, p. 331). Moreover, the right-hand side of 
(2) is a function of bounded variation (see Corollary 2, p. 331). It follows 
that the largest class of functions satisfying (2) must be some subset of the 
class of functions of bounded variation. Since every function of bounded 
variation is the difference between two nondecreasing functions (see Theorem 
4, p. 331), we begin by studying nondecreasing functions from the standpoint 
of formula (1). 


THEOREM 1. Let $F$ be a nondecreasing function on $[a, b]$. Then the
derivative $F'$ is summable on $[a, b]$ and

\[ \int_a^b F'(t)\,dt \le F(b) - F(a). \tag{3} \]

Proof. Let

\[ \Phi_n(t) = n\left[F\!\left(t + \frac{1}{n}\right) - F(t)\right] \qquad (n = 1, 2, \ldots), \]

where, to make $\Phi_n(t)$ meaningful for all $t \in [a, b]$, we set $F(t) = F(b)$
for $b < t \le b + 1$, by definition.⁸ Clearly

\[ F'(t) = \lim_{n \to \infty} \frac{F\!\left(t + \frac{1}{n}\right) - F(t)}{1/n} = \lim_{n \to \infty} \Phi_n(t) \]

almost everywhere on $[a, b]$. Since $F$ is summable on $[a, b]$, by Theorem
1, p. 316, so is every $\Phi_n$. Integrating $\Phi_n$, we get

\[
\begin{aligned}
\int_a^b \Phi_n(t)\,dt
&= n \int_a^b \left[F\!\left(t + \frac{1}{n}\right) - F(t)\right] dt
 = n \left[\int_{a+(1/n)}^{b+(1/n)} F(t)\,dt - \int_a^b F(t)\,dt\right] \\
&= n \left[\int_b^{b+(1/n)} F(t)\,dt - \int_a^{a+(1/n)} F(t)\,dt\right] \le F(b) - F(a),
\end{aligned}
\]

where in the last step we use the fact that $F$ is nondecreasing. The
summability of $F'$ and the inequality (3) now follow at once from Fatou’s
theorem (Theorem 3, p. 307). ∎

⁸ Verify that this does not affect the validity of the proof.


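Example 1. It is easy to find nondecreasing functions $F$ for which (3)
becomes a strict inequality, i.e., such that

\[ \int_a^b F'(t)\,dt < F(b) - F(a). \tag{4} \]

For example, let

\[ F(t) = \begin{cases} 0 & \text{if } 0 \le t \le \tfrac12, \\ 1 & \text{if } \tfrac12 < t \le 1. \end{cases} \]

Then

\[ 0 = \int_0^1 F'(t)\,dt < F(1) - F(0) = 1. \]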


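Example 2 (The Cantor function). In the preceding example, $F$ is
discontinuous. However, it is also possible to find continuous nondecreasing functions
satisfying the strict inequality (4). To this end, let

\[ [a_1^{(1)}, b_1^{(1)}] = \left[\tfrac13, \tfrac23\right] \]

be the middle third of the interval $[0, 1]$, let

\[ [a_1^{(2)}, b_1^{(2)}] = \left[\tfrac19, \tfrac29\right], \qquad [a_2^{(2)}, b_2^{(2)}] = \left[\tfrac79, \tfrac89\right] \]

be the middle thirds of the intervals remaining after deleting $[a_1^{(1)}, b_1^{(1)}]$ from
$[0, 1]$, let

\[
\begin{aligned}
[a_1^{(3)}, b_1^{(3)}] &= \left[\tfrac{1}{27}, \tfrac{2}{27}\right], & [a_2^{(3)}, b_2^{(3)}] &= \left[\tfrac{7}{27}, \tfrac{8}{27}\right], \\
[a_3^{(3)}, b_3^{(3)}] &= \left[\tfrac{19}{27}, \tfrac{20}{27}\right], & [a_4^{(3)}, b_4^{(3)}] &= \left[\tfrac{25}{27}, \tfrac{26}{27}\right]
\end{aligned}
\]

be the middle thirds of the intervals remaining after deleting $[a_1^{(1)}, b_1^{(1)}]$,
$[a_1^{(2)}, b_1^{(2)}]$ and $[a_2^{(2)}, b_2^{(2)}]$ from $[0, 1]$, and so on, with

\[ [a_1^{(n)}, b_1^{(n)}], \ldots, [a_{2^{n-1}}^{(n)}, b_{2^{n-1}}^{(n)}] \]

being the $2^{n-1}$ intervals deleted at the $n$th stage. Note that the complement of
the union of all the intervals $[a_k^{(n)}, b_k^{(n)}]$ is the set of all “points of the second
kind” of the Cantor set constructed in Example 4, p. 52, i.e., all points of the
Cantor set except the end points

\[ 0,\ 1,\ \tfrac13,\ \tfrac23,\ \tfrac19,\ \tfrac29,\ \tfrac79,\ \tfrac89,\ \ldots \tag{5} \]

of the deleted intervals (together with the points 0 and 1).
Now define a function

\[ F(t) = \frac{2k - 1}{2^n} \quad \text{if } t \in [a_k^{(n)}, b_k^{(n)}], \]

so that

\[ F(t) = \tfrac12 \quad \text{if } \tfrac13 \le t \le \tfrac23, \]

\[ F(t) = \begin{cases} \tfrac14 & \text{if } \tfrac19 \le t \le \tfrac29, \\ \tfrac34 & \text{if } \tfrac79 \le t \le \tfrac89, \end{cases} \]

\[ F(t) = \begin{cases} \tfrac18 & \text{if } \tfrac{1}{27} \le t \le \tfrac{2}{27}, \\ \tfrac38 & \text{if } \tfrac{7}{27} \le t \le \tfrac{8}{27}, \\ \tfrac58 & \text{if } \tfrac{19}{27} \le t \le \tfrac{20}{27}, \\ \tfrac78 & \text{if } \tfrac{25}{27} \le t \le \tfrac{26}{27}, \end{cases} \]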

and so on, as shown schematically in Figure 19. Then $F$ is defined everywhere
on $[0, 1]$ except at points of the second kind of the Cantor set. Given any
such point $t^*$, let $\{t_n\}$ be an increasing sequence of points of the type (5)
converging to $t^*$, and let $\{t_n'\}$ be a decreasing sequence of points of the same
type converging to $t^*$ (why do such sequences exist?). Then let

\[ F(t^*) = \lim_{n \to \infty} F(t_n) = \lim_{n \to \infty} F(t_n') \]

(justify the equality of the limits). Completing the definition of $F$ in this way,
we obtain a continuous nondecreasing function on the whole interval $[0, 1]$,
known as the Cantor function. (Fill in some missing details.) The derivative
$F'$ obviously vanishes at every interior point of the intervals $[a_k^{(n)}, b_k^{(n)}]$, and
hence vanishes almost everywhere, since the sum of the lengths of these
intervals equals

\[ \frac13 + \frac29 + \frac{4}{27} + \cdots = 1 \]

(the Cantor set is of measure zero). It follows that

\[ 0 = \int_0^1 F'(t)\,dt < F(1) - F(0) = 1. \]

FIGURE 19
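The construction in Example 2 can be checked numerically. The following is only a sketch under stated assumptions: the helper names `cantor` and `deleted_length` are hypothetical (they do not come from the text), and the Cantor function is evaluated through the ternary digits of $t$, which is a standard equivalent description rather than the stage-by-stage definition used above.

```python
from fractions import Fraction

def cantor(t, depth=60):
    """Cantor function at t in [0, 1], read off from the ternary digits of t.

    The value is exact when t lies in an interval deleted at stage <= depth
    or has a terminating ternary expansion of at most `depth` digits;
    otherwise the result is a truncation accurate to within 2**(-depth).
    """
    t = Fraction(t)
    if t >= 1:
        return Fraction(1)
    value, weight = Fraction(0), Fraction(1, 2)
    for _ in range(depth):
        t *= 3
        digit = int(t)                   # next ternary digit: 0, 1 or 2
        t -= digit
        if digit == 1:                   # t lies in a deleted middle third
            return value + weight
        value += weight * (digit // 2)   # ternary digit 2 becomes binary digit 1
        weight /= 2
    return value

def deleted_length(stages):
    """Total length of the intervals deleted in the first `stages` stages."""
    return sum(Fraction(2 ** (n - 1), 3 ** n) for n in range(1, stages + 1))
```

With exact rational arithmetic, `cantor(Fraction(1, 9))` agrees with the value $\tfrac14$ assigned on $[\tfrac19, \tfrac29]$, and `1 - deleted_length(N)` equals $(2/3)^N$, which tends to zero in accordance with the series $\tfrac13 + \tfrac29 + \tfrac{4}{27} + \cdots = 1$.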


33.2. Absolutely continuous functions. We have just given examples of
functions for which formula (1) does not hold. To describe the class of
functions satisfying (1), or equivalently (2), we will need the following


DEFINITION. A function $f$ defined on an interval $[a, b]$ is said to be
absolutely continuous on $[a, b]$ if, given any $\varepsilon > 0$, there is a $\delta > 0$ such that

\[ \sum_{k=1}^{n} |f(b_k) - f(a_k)| < \varepsilon \]

for every finite system of pairwise disjoint subintervals

\[ (a_k, b_k) \subset [a, b] \qquad (k = 1, \ldots, n) \]

of total length

\[ \sum_{k=1}^{n} (b_k - a_k) \]

less than $\delta$.


Remark 1. Clearly every absolutely continuous function is uniformly
continuous, as we see by choosing a single subinterval $(a_1, b_1) \subset [a, b]$.
However, a uniformly continuous function need not be absolutely continuous.
For example, the Cantor function $F$ constructed in Example 2 of the preceding
section is continuous (and hence uniformly continuous) on $[0, 1]$, but not
absolutely continuous on $[0, 1]$. In fact, the Cantor set can be covered by a
finite system of subintervals $(a_k, b_k)$ of arbitrarily small total length (why?).
But obviously

\[ \sum_{k} |F(b_k) - F(a_k)| = 1 \]

for every such system. The same example shows that a function of bounded
variation need not be absolutely continuous. On the other hand, an absolutely
continuous function is necessarily of bounded variation (see Theorem 2).
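The failure of absolute continuity described in Remark 1 can be illustrated with exact arithmetic. This is a sketch, not part of the original argument: the names `cantor`, `stage_intervals` and `cover_report` are hypothetical, the Cantor function is computed from ternary digits (an assumed but standard equivalent of the construction in Example 2), and the covering system is taken to be the $2^n$ intervals left after $n$ stages of deleting middle thirds.

```python
from fractions import Fraction

def cantor(t, depth=60):
    """Cantor function via ternary digits; exact at the end points used below."""
    t = Fraction(t)
    if t >= 1:
        return Fraction(1)
    value, weight = Fraction(0), Fraction(1, 2)
    for _ in range(depth):
        t *= 3
        digit = int(t)
        t -= digit
        if digit == 1:
            return value + weight
        value += weight * (digit // 2)
        weight /= 2
    return value

def stage_intervals(n):
    """End points of the 2**n intervals remaining after n deletion stages."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(n):
        refined = []
        for a, b in intervals:
            third = (b - a) / 3
            refined += [(a, a + third), (b - third, b)]
        intervals = refined
    return intervals

def cover_report(n):
    """Return (total length, sum of |F(b_k) - F(a_k)|) for the stage-n system."""
    intervals = stage_intervals(n)
    total_length = sum(b - a for a, b in intervals)
    total_increment = sum(abs(cantor(b) - cantor(a)) for a, b in intervals)
    return total_length, total_increment
```

For every $n$ the total length is $(2/3)^n$, which can be made smaller than any $\delta$, while the sum of the increments stays equal to 1, so no $\delta$ works for $\varepsilon \le 1$.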


Remark 2. In the definition, we can change “finite” to “finite or
countable.” In fact, suppose that given any $\varepsilon > 0$, there is a $\delta > 0$ such that

\[ \sum_{k=1}^{n} |f(b_k) - f(a_k)| < \varepsilon' < \varepsilon \]

for every finite system of pairwise disjoint intervals $(a_k, b_k) \subset [a, b]$ of total
length less than $\delta$, and consider any countable system of pairwise disjoint
intervals $(\alpha_k, \beta_k) \subset [a, b]$ of total length less than $\delta$. Then obviously

\[ \sum_{k=1}^{n} |f(\beta_k) - f(\alpha_k)| < \varepsilon' \]

for every $n$. Hence, taking the limit as $n \to \infty$, we get

\[ \sum_{k=1}^{\infty} |f(\beta_k) - f(\alpha_k)| \le \varepsilon' < \varepsilon. \]


THEOREM 2. If $f$ is absolutely continuous on $[a, b]$, then $f$ is of bounded
variation on $[a, b]$.


Proof. Given any $\varepsilon > 0$, there is a $\delta > 0$ such that

\[ \sum_{k=1}^{n} |f(b_k) - f(a_k)| < \varepsilon \]

for every system of pairwise disjoint intervals $(a_k, b_k) \subset [a, b]$ such that

\[ \sum_{k=1}^{n} (b_k - a_k) < \delta. \]

Hence if $[\alpha, \beta]$ is any interval of length less than $\delta$, we have

\[ V_\alpha^\beta(f) \le \varepsilon. \]

Let

\[ a = x_0 < x_1 < \cdots < x_N = b \]

be a partition of $[a, b]$ into $N$ subintervals $[x_{k-1}, x_k]$ all of length less
than $\delta$. Then, by Theorem 2, p. 329,

\[ V_a^b(f) \le N\varepsilon < \infty. \; ∎ \]


THEOREM 3. If $f$ is absolutely continuous on $[a, b]$, then so is $\alpha f$, where
$\alpha$ is any constant. Moreover, if $f$ and $g$ are absolutely continuous on $[a, b]$,
then so is $f + g$.


Proof. An immediate consequence of the definition of absolute
continuity and obvious properties of the absolute value. ∎


It follows from Theorems 2 and 3 (together with Remark 1) that the set
of all absolutely continuous functions on $[a, b]$ is a proper subspace of the
linear space of all functions of bounded variation on $[a, b]$.


THEOREM 4. If $f$ is absolutely continuous on $[a, b]$, then $f$ can be
represented as the difference between two absolutely continuous nondecreasing
functions on $[a, b]$.

Proof. By Theorem 2, $f$ is of bounded variation on $[a, b]$, and hence
can be represented in the form

\[ f = v - g, \]

where

\[ v(x) = V_a^x(f), \qquad g = v - f \]

are the same nondecreasing functions as in Theorem 4, p. 331. We now
verify that $v$ and $g$ are absolutely continuous. Given any $\varepsilon > 0$, let $\delta > 0$
be such that

\[ \sum_{k=1}^{n} |f(b_k) - f(a_k)| < \varepsilon' < \varepsilon \]

for every finite system of pairwise disjoint subintervals $(a_k, b_k) \subset [a, b]$
of total length less than $\delta$. Consider the sum

\[ \sum_{k=1}^{n} |v(b_k) - v(a_k)| = \sum_{k=1}^{n} [v(b_k) - v(a_k)], \]

equal to the least upper bound of the sums

\[ \sum_{k=1}^{n} \sum_{l=1}^{m_k} |f(x_{k,l}) - f(x_{k,l-1})| \tag{6} \]

taken over all possible finite partitions

\[
\begin{aligned}
a_1 &= x_{1,0} < x_{1,1} < \cdots < x_{1,m_1} = b_1, \\
&\;\;\vdots \\
a_n &= x_{n,0} < x_{n,1} < \cdots < x_{n,m_n} = b_n
\end{aligned}
\]

of the intervals $(a_1, b_1), \ldots, (a_n, b_n)$. The total length of all the intervals
$(x_{k,l-1}, x_{k,l})$ figuring in (6) is clearly less than $\delta$, and hence the sum (6) is
less than $\varepsilon'$, by the absolute continuity of $f$. Therefore

\[ \sum_{k=1}^{n} |v(b_k) - v(a_k)| \le \varepsilon' < \varepsilon, \]

i.e., $v$ is absolutely continuous on $[a, b]$. It follows from Theorem 3
that $g = v - f$ is also absolutely continuous on $[a, b]$. ∎


We now study the close connection between absolute continuity and the 
indefinite Lebesgue integral: 


THEOREM 5. The indefinite integral

\[ F(x) = \int_a^x f(t)\,dt \]

of a summable function $f$ is absolutely continuous.

Proof. Given any finite collection of pairwise disjoint intervals
$(a_k, b_k)$, we have

\[ \sum_{k=1}^{n} |F(b_k) - F(a_k)| = \sum_{k=1}^{n} \left| \int_{a_k}^{b_k} f(t)\,dt \right| \le \sum_{k=1}^{n} \int_{a_k}^{b_k} |f(t)|\,dt = \int_{\bigcup_k (a_k, b_k)} |f(t)|\,dt. \]

But the last expression on the right approaches zero as the total length
of the intervals $(a_k, b_k)$ approaches zero, by the absolute continuity of
the Lebesgue integral (Theorem 6, p. 300). ∎


LEMMA. Let $f$ be an absolutely continuous nondecreasing function on
$[a, b]$ such that $f'(x) = 0$ almost everywhere. Then $f(x) = \text{const}$.


Proof. Since $f$ is continuous and nondecreasing, its range is the closed
interval $[f(a), f(b)]$. We will show that the length of this interval is zero
if $f'(x) = 0$ almost everywhere, thereby proving the lemma. Let $E$ be
the set of points $x \in [a, b]$ such that $f'(x) = 0$, and let $Z = [a, b] - E$,
where $\mu(Z) = 0$, by hypothesis. Given any $\varepsilon > 0$, we find $\delta > 0$ such
that

\[ \sum_{k} |f(b_k) - f(a_k)| < \varepsilon \tag{7} \]

for any finite or countable system of pairwise disjoint intervals $(a_k, b_k) \subset
[a, b]$ of total length less than $\delta$ (recall Remark 2, p. 336), and then cover $Z$
by an open set of measure less than $\delta$ (this is possible, since $Z$ is of measure
zero). In other words, we cover $Z$ by a finite or countable system of
intervals $(a_k, b_k)$ of total length less than $\delta$. It then follows from (7) that
the whole system of intervals, and hence (a fortiori) the set

\[ Z \subset \bigcup_{k} (a_k, b_k), \]

is mapped into a set of measure less than $\varepsilon$. But then $\mu[f(Z)] = 0$,
since $\varepsilon > 0$ is arbitrary.
Next consider the set $E = [a, b] - Z$, and let $x_0 \in E$. Then, since
$f'(x_0) = 0$, we have

\[ \frac{f(x) - f(x_0)}{x - x_0} < \varepsilon \]

for all $x > x_0$ sufficiently near $x_0$, i.e.,

\[ f(x) - f(x_0) < \varepsilon (x - x_0) \]

or

\[ \varepsilon x_0 - f(x_0) < \varepsilon x - f(x). \]

Therefore the point $x_0$ is invisible from the right with respect to the
function $\varepsilon x - f(x)$. It follows from Lemma 1, p. 319 that $E$ is the
union of no more than countably many pairwise disjoint intervals $(\alpha_k, \beta_k)$,
with end points satisfying the inequalities

\[ \varepsilon \alpha_k - f(\alpha_k) \le \varepsilon \beta_k - f(\beta_k) \]

or

\[ f(\beta_k) - f(\alpha_k) \le \varepsilon (\beta_k - \alpha_k). \]

But then

\[ \sum_{k} [f(\beta_k) - f(\alpha_k)] \le \varepsilon \sum_{k} (\beta_k - \alpha_k) \le \varepsilon (b - a). \]

In other words, $f$ maps $E$ into a set covered by a system of intervals of
total length less than $\varepsilon (b - a)$. Therefore $\mu[f(E)] = 0$, since $\varepsilon > 0$
is arbitrary.

We have just shown that the sets $f(Z)$ and $f(E)$ are both of measure
zero. But the interval $[f(a), f(b)]$ is the union of $f(Z)$ and $f(E)$. It
follows that $[f(a), f(b)]$ is of length zero, i.e., that $f(x) = \text{const}$. ∎


We are now in a position to prove

THEOREM 6 (Lebesgue). If $F$ is absolutely continuous on $[a, b]$, then
the derivative $F'$ is summable on $[a, b]$ and

\[ F(x) = F(a) + \int_a^x F'(t)\,dt. \tag{8} \]

Proof. We need only consider the case of nondecreasing $F$ (why?).
Then $F'$ is summable, by Theorem 1, and the function

\[ \Phi(x) = F(x) - \int_a^x F'(t)\,dt \tag{9} \]

is also nondecreasing. In fact, if $x'' > x'$, then

\[ \Phi(x'') - \Phi(x') = F(x'') - F(x') - \int_{x'}^{x''} F'(t)\,dt \ge 0, \]

where we again use Theorem 1. Moreover, $\Phi$ is absolutely continuous,
being the difference between two absolutely continuous functions (recall
Theorems 3 and 5), and $\Phi'(x) = 0$ almost everywhere, by Theorem 8,
p. 324. It follows from the lemma that $\Phi(x) = \text{const}$. Setting $x = a$,
we find that this constant equals $F(a)$. Replacing $\Phi(x)$ by $F(a)$ in (9),
we get (8). ∎

Remark. Combining Theorems 5 and 6, we can now give a definitive
answer to the second of the questions posed on p. 314 (see also p. 333):
The formula

\[ \int_a^x F'(t)\,dt = F(x) - F(a), \]

or equivalently,

\[ F(x) = F(a) + \int_a^x F'(t)\,dt, \]

holds for all $x \in [a, b]$ if and only if $F$ is absolutely continuous on $[a, b]$.


33.3. The Lebesgue decomposition. Let $f$ be a function of bounded
variation on $[a, b]$. Then it follows from Theorem 4, p. 331 and Problem 9, p. 327
that $f$ can (in general) be represented as a sum

\[ f(x) = \varphi(x) + \psi(x), \tag{10} \]

where $\varphi$ is a continuous function of bounded variation and $\psi$ is a jump
function.⁹ Now let

\[ \varphi_1(x) = \int_a^x \varphi'(t)\,dt, \tag{11} \]

\[ \varphi_2(x) = \varphi(x) - \varphi_1(x). \]

Then $\varphi_1$ is absolutely continuous, while $\varphi_2$ is a continuous function of bounded
variation such that

\[ \varphi_2'(x) = \varphi'(x) - \frac{d}{dx} \int_a^x \varphi'(t)\,dt = 0 \]

almost everywhere. A continuous function of bounded variation is said to
be singular if its derivative vanishes almost everywhere. For example, the
Cantor function $F$ constructed in Example 2, p. 334 is singular. Combining
(10) and (11), we find that a function $f$ of bounded variation can (in general)
be represented as a sum

\[ f(x) = \varphi_1(x) + \varphi_2(x) + \psi(x) \tag{12} \]

of an absolutely continuous function $\varphi_1$, a singular function $\varphi_2$ and a jump
function $\psi$. Formula (12) is known as the Lebesgue decomposition.


Remark. Differentiating (12), we get

\[ f'(x) = \varphi_1'(x) \]

almost everywhere. Thus integration of the derivative of a function of
bounded variation does not restore the function itself, but only its absolutely
continuous “component,” while the other two components, i.e., the singular
function and the jump function, “disappear without a trace.”


Problem 1. Prove that a function $f$ is absolutely continuous on $[a, b]$ if
and only if it is a continuous function of bounded variation mapping every
subset $Z \subset [a, b]$ of measure zero into a set of measure zero.


⁹ Generalizing Problem 9, p. 327, by a jump function, we now mean a function of
the form

\[ \psi(x) = \sum_{x_n < x} h_n + \sum_{x_n' \le x} h_n', \]

where the numbers $h_1, \ldots, h_n, \ldots$ and $h_1', \ldots, h_n', \ldots$ corresponding to the
discontinuity points $x_1, \ldots, x_n, \ldots$ and $x_1', \ldots, x_n', \ldots$ satisfy the conditions

\[ \sum_{n} |h_n| < \infty, \qquad \sum_{n} |h_n'| < \infty \]

(we now allow negative $h_n$, $h_n'$).

Problem 2. Verify directly from the definition on p. 336 that the function

\[ f(x) = \begin{cases} x \sin \dfrac{1}{x} & \text{if } x \ne 0, \\ 0 & \text{if } x = 0 \end{cases} \]

fails to be absolutely continuous on any interval $[a, b]$ containing the point
$x = 0$.


Problem 3. Prove that if a function $f$ satisfies a Lipschitz condition

\[ |f(x') - f(x'')| \le K |x' - x''| \]

for all $x', x'' \in [a, b]$, then $f$ is absolutely continuous on $[a, b]$.


Problem 4. Prove that each of the terms $\varphi_1$, $\varphi_2$ and $\psi$ in the Lebesgue
decomposition (12) is unique to within an additive constant.


Comment. The stipulation “to within an additive constant” can be
dropped if we require the function $f$ and its “components” to vanish at $x = a$,
say, or if we agree to regard all functions differing by a constant as equivalent.


Problem 5. Let $A^0_{[a,b]}$ be the space of all absolutely continuous functions
$f$ defined on $[a, b]$, satisfying the condition $f(a) = 0$. Prove that $A^0_{[a,b]}$ is
a closed subspace of the space $V^0_{[a,b]}$ of all functions of bounded variation
on $[a, b]$ satisfying the same condition, equipped with the norm $\|f\| = V_a^b(f)$.


Comment. There is no need for the condition $f(a) = 0$ if we regard all
functions differing by a constant as equivalent. We then have $\|f\| = 0$ if
and only if $f = \text{const}$.


Problem 6. Starting from a locally summable function $f$, i.e., a function
summable on every finite interval, define the corresponding generalized
function $f$ and generalized derivative $f'$ by the formulas

\[ (f, \varphi) = \int_{-\infty}^{\infty} f(x)\varphi(x)\,dx, \]

\[ (f', \varphi) = -\int_{-\infty}^{\infty} f(x)\varphi'(x)\,dx \]

as in Sec. 21.2. (Here $\varphi$ is any test function, i.e., any infinitely differentiable
function of finite support.) Prove that the generalized derivative $f'$ determines
$f$ to within an additive constant. Apply this to the case of the function

\[ f(x) = \begin{cases} 0 & \text{if } x < 0, \\ F(x) & \text{if } 0 \le x \le 1, \\ 1 & \text{if } x > 1, \end{cases} \]

where $F$ is the Cantor function constructed in Example 2, p. 334.


Hint. See Theorem 1, p. 213.


Problem 7. Let $f$ and $f'$ be the same as in the preceding problem, and
suppose $f$ is of bounded variation on $(-\infty, \infty)$. Then $f$ has an ordinary
derivative almost everywhere. Let $f_1$ be the generalized function
corresponding to $df/dx$, so that

\[ (f_1, \varphi) = \int_{-\infty}^{\infty} \frac{df}{dx}\,\varphi(x)\,dx. \]

Prove that

a) In general, $f_1$ does not equal the generalized derivative $f'$;

b) If $f$ is absolutely continuous, then $f_1 = f'$;

c) If $f_1 = f'$, then $f$ is equivalent to an absolutely continuous function¹⁰
and, in particular, is absolutely continuous if it is continuous.


Hint. In a), consider the function

\[ f(x) = \begin{cases} 0 & \text{if } x \le 0, \\ 1 & \text{if } x > 0. \end{cases} \]

Comment. Problems 6 and 7 further illustrate the situation discussed 
on pp. 206-207. To carry out the operations of analysis (in this case, recon- 
struction of a function from its derivative), we can either restrict the class of 
admissible functions (by requiring them to be absolutely continuous) or else 
extend the notion of function itself (at the same time, extending the notion 
of a derivative). 


¹⁰ I.e., coincides almost everywhere with an absolutely continuous function.


34. The Lebesgue Integral as a Set Function


34.1. Charges. The Hahn and Jordan decompositions. As we now show,
the theory developed in Secs. 31-33 for functions defined on the real line
$(-\infty, \infty)$ continues to make sense in a much more general setting. Let $X$
be a space (i.e., some “master set”) equipped with a measure $\mu$, and let $f$
be a $\mu$-summable function defined on $X$. Then $f$ is summable on every
measurable subset $E \subset X$, so that the integral

\[ \Phi(E) = \int_E f(x)\,d\mu \tag{1} \]

(for fixed $f$) defines a set function on the system $\mathscr{S}_\mu$ of all $\mu$-measurable
subsets of $X$. By Theorem 4, p. 298, $\Phi$ is $\sigma$-additive, i.e., if a measurable
set $E$ is a finite or countable union

\[ E = \bigcup_{n} E_n \]

of pairwise disjoint measurable sets $E_n$, then

\[ \Phi(E) = \sum_{n} \Phi(E_n). \]

In other words, the set function (1) has all the properties of a $\sigma$-additive
measure except that it may not be nonnegative in the case where $f$ takes
negative values. These considerations suggest


DEFINITION 1. A $\sigma$-additive set function $\Phi$ defined on a $\sigma$-ring (in
particular, a $\sigma$-algebra) of subsets of a space $X$ and in general taking
values of both signs is called a signed measure or charge (on $X$).


Remark. Thus the notion of a measure is equivalent to that of a
nonnegative charge.


In the case of electrical charge distributed on a surface, we can divide 
the surface into two regions, one carrying positive charge (i.e., such that 
every part of the region is positively charged) and one carrying negative 
charge. We will establish the mathematical equivalent of this fact in a 
moment, after first introducing 


DEFINITION 2. Let $\Phi$ be a charge defined on a $\sigma$-algebra $\mathscr{S}$ of subsets
of a space $X$. Then a set $A \subset X$ is said to be negative with respect to $\Phi$
if $E \cap A \in \mathscr{S}$ and $\Phi(E \cap A) \le 0$ for every $E \in \mathscr{S}$. Similarly, $A$ is said
to be positive with respect to $\Phi$ if $E \cap A \in \mathscr{S}$ and $\Phi(E \cap A) \ge 0$ for
every $E \in \mathscr{S}$.


THEOREM 1. Given a charge $\Phi$ on a space $X$, there is a measurable set
$A^- \subset X$ such that $A^-$ is negative and $A^+ = X - A^-$ is positive with
respect to $\Phi$.


Proof. Let

\[ a = \inf \Phi(A), \]

where the greatest lower bound is taken over all measurable negative
sets $A$. Let $\{A_n\}$ be a sequence of measurable negative sets such that

\[ \lim_{n \to \infty} \Phi(A_n) = a. \]

Then

\[ A^- = \bigcup_{n} A_n \]

is a measurable negative set such that

\[ \Phi(A^-) = a \]

(why?). To show that $A^-$ is the required set, we must now prove that
$A^+ = X - A^-$ is positive. Suppose $A^+$ is not positive. Then $A^+$ contains
a measurable subset $B_0$ such that $\Phi(B_0) < 0$. However, $B_0$ cannot be
negative, since if it were, the set $\tilde{A} = A^- \cup B_0$ would be a negative set
such that $\Phi(\tilde{A}) < a$, which is impossible. Hence there is a least positive
integer $k_1$ such that $B_0$ contains a subset $B_1$ satisfying the condition

\[ \Phi(B_1) \ge \frac{1}{k_1}. \]

Obviously $B_1 \ne B_0$. Applying the same argument to the set $B_0 - B_1$,
we find a least positive integer $k_2$ such that $B_0 - B_1$ contains a subset $B_2$
satisfying the inequality

\[ \Phi(B_2) \ge \frac{1}{k_2} \qquad (k_2 \ge k_1) \]

(explain why $k_2 \ge k_1$), a least positive integer $k_3$ such that $(B_0 - B_1) - B_2$
contains a subset $B_3$ satisfying the inequality

\[ \Phi(B_3) \ge \frac{1}{k_3} \qquad (k_3 \ge k_2), \]

and so on. Now let

\[ F = B_0 - \bigcup_{n=1}^{\infty} B_n. \]

Clearly $F$ is nonempty, since $\Phi(B_0) < 0$ while $\Phi(B_n) > 0$ for all $n \ge 1$.
Moreover, $F$ is negative by construction (think things through). Hence
the set $\tilde{A} = A^- \cup F$ is again negative and $\Phi(\tilde{A}) < a$, which is impossible.
This contradiction shows that $A^+ = X - A^-$ must be positive. ∎
Thus we can represent $X$ as a union

\[ X = A^+ \cup A^- \tag{2} \]

of two disjoint measurable sets $A^+$ and $A^-$, where $A^+$ is positive and $A^-$ is
negative with respect to the charge $\Phi$. The representation (2) is called the
Hahn decomposition of $X$, and may not be unique. However, if

\[ X = A_1^+ \cup A_1^-, \qquad X = A_2^+ \cup A_2^- \]

are two distinct Hahn decompositions of $X$, then

\[ \Phi(E \cap A_1^-) = \Phi(E \cap A_2^-), \qquad \Phi(E \cap A_1^+) = \Phi(E \cap A_2^+) \tag{3} \]

for every $E \in \mathscr{S}$. In fact,

\[ E \cap (A_1^- - A_2^-) \subset E \cap A_1^- \tag{4} \]

and at the same time

\[ E \cap (A_1^- - A_2^-) \subset E \cap A_2^+. \tag{5} \]

But (4) implies

\[ \Phi(E \cap (A_1^- - A_2^-)) \le 0, \]

while (5) implies

\[ \Phi(E \cap (A_1^- - A_2^-)) \ge 0. \]

Therefore

\[ \Phi(E \cap (A_1^- - A_2^-)) = 0, \tag{6} \]

and similarly

\[ \Phi(E \cap (A_2^- - A_1^-)) = 0. \tag{7} \]

It follows from (6) and (7) that

\[
\begin{aligned}
\Phi(E \cap A_1^-) &= \Phi(E \cap (A_1^- - A_2^-)) + \Phi(E \cap (A_1^- \cap A_2^-)) \\
&= \Phi(E \cap (A_2^- - A_1^-)) + \Phi(E \cap (A_1^- \cap A_2^-)) = \Phi(E \cap A_2^-),
\end{aligned}
\]

which proves the first of the formulas (3). The second formula is proved
in exactly the same way.
Thus a charge $\Phi$ on a space $X$ uniquely determines two nonnegative set
functions, namely

\[ \Phi^+(E) = \Phi(E \cap A^+), \qquad \Phi^-(E) = -\Phi(E \cap A^-), \]

called the positive variation and negative variation of $\Phi$, respectively. It is
clear that

1) $\Phi = \Phi^+ - \Phi^-$;
2) $\Phi^+$ and $\Phi^-$ are nonnegative $\sigma$-additive set functions, i.e., measures;
3) The set function $|\Phi| = \Phi^+ + \Phi^-$, called the total variation of $\Phi$, is
also a measure.

The representation

\[ \Phi = \Phi^+ - \Phi^- \]

of a charge $\Phi$ as the difference between its positive and negative variations
is called the Jordan decomposition of $\Phi$.
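For a charge on a finite set, the Hahn and Jordan decompositions can be written down explicitly. The sketch below rests on assumptions not in the text: the charge is given by point weights, so that $\Phi(E)$ is the sum of the weights of the points of $E$, and the function name `hahn_jordan` is hypothetical.

```python
def hahn_jordan(weights):
    """Hahn and Jordan decompositions of the charge Phi(E) = sum of weights on E.

    `weights` maps each point of a finite space X to a real number.
    Points of weight zero are put in A+ here; putting them in A- instead
    gives another Hahn decomposition with the same variations.
    """
    a_plus = {x for x, w in weights.items() if w >= 0}
    a_minus = set(weights) - a_plus

    def phi(E):
        return sum(weights[x] for x in E)

    def phi_plus(E):       # positive variation: Phi(E ∩ A+)
        return phi(set(E) & a_plus)

    def phi_minus(E):      # negative variation: -Phi(E ∩ A-)
        return -phi(set(E) & a_minus)

    return a_plus, a_minus, phi, phi_plus, phi_minus


weights = {"a": 3, "b": -2, "c": 0, "d": -1}
a_plus, a_minus, phi, phi_plus, phi_minus = hahn_jordan(weights)
E = {"a", "b", "c"}
total_variation = phi_plus(E) + phi_minus(E)
```

Here the point `"c"` carries zero charge, which is exactly the kind of situation in which the Hahn decomposition fails to be unique while $\Phi^+$ and $\Phi^-$ are unaffected.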


34.2. Classification of charges. The Radon-Nikodym theorem. We now
classify charges on a space $X$ equipped with a measure:


DEFINITION 3. Let $\mu$ be a $\sigma$-additive measure on a $\sigma$-algebra $\mathscr{S}_\mu$ of
($\mu$-measurable) subsets of a space $X$, and let $\Phi$ be a charge defined on $\mathscr{S}_\mu$.
Then $\Phi$ is said to be concentrated on a set $A \in \mathscr{S}_\mu$ if $\Phi(E) = 0$ for every
measurable set $E \subset X - A$.


DEFINITION 4. Let $\mu$, $\mathscr{S}_\mu$, $X$ and $\Phi$ be the same as in Definition 3.
Then $\Phi$ is said to be

1) Continuous if $\Phi(E) = 0$ for every single-element set $E \subset X$ of
measure zero;

2) Singular if $\Phi$ is concentrated on a set of measure zero;

3) Discrete if $\Phi$ is concentrated on a finite or countable set of measure
zero;

4) Absolutely continuous (with respect to $\mu$) if $\Phi(E) = 0$ for every
measurable set $E$ such that $\mu(E) = 0$.


Clearly, the Lebesgue integral

\[ \Phi(E) = \int_E \varphi(x)\,d\mu \]

of a fixed summable function $\varphi$ is absolutely continuous with respect to the
measure $\mu$. As we will see in a moment, every absolutely continuous charge
can be represented in this form. But first we need the following


LEMMA. Let $\mu$ be a $\sigma$-additive measure defined on a $\sigma$-algebra $\mathscr{S}_\mu$ of
subsets of a space $X$, and let $\Phi$ be another such measure defined on $\mathscr{S}_\mu$.
Suppose $\Phi$ is absolutely continuous with respect to $\mu$ and is not identically
zero. Then there is a positive integer $n$ and a set $A \in \mathscr{S}_\mu$ such that $\mu(A) > 0$
and $A$ is positive with respect to the charge $\Phi - (1/n)\mu$.


Proof. Let

\[ X = A_n^- \cup A_n^+ \qquad (n = 1, 2, \ldots) \]

be the Hahn decomposition corresponding to the charge $\Phi - (1/n)\mu$,
and let

\[ A_0^- = \bigcap_{n=1}^{\infty} A_n^-, \qquad A_0^+ = \bigcup_{n=1}^{\infty} A_n^+. \]

Then

\[ \Phi(A_0^-) \le \frac{1}{n}\,\mu(A_0^-) \]

for all $n = 1, 2, \ldots$, i.e., $\Phi(A_0^-) = 0$, and hence $\Phi(A_0^+) > 0$ since
$X = A_0^- \cup A_0^+$ and $\Phi$ is not identically zero. But then $\mu(A_0^+) > 0$, by
the absolute continuity of $\Phi$. Hence there is an $n$ such that $\mu(A_n^+) > 0$
(why?). This $n$ and the set $A = A_n^+$ satisfy the conditions of the lemma. ∎


THEOREM 2 (Radon-Nikodym). Let $\mu$ be a $\sigma$-additive measure defined
on a $\sigma$-algebra $\mathscr{S}_\mu$ of subsets of a space $X$, and let $\Phi$ be a charge defined on
$\mathscr{S}_\mu$. Suppose $\Phi$ is absolutely continuous with respect to $\mu$. Then there is a
$\mu$-summable function $\varphi$ on $X$ such that

\[ \Phi(E) = \int_E \varphi(x)\,d\mu \tag{8} \]

for every $E \in \mathscr{S}_\mu$. The function $\varphi$ is unique to within its values on a set
of $\mu$-measure zero.

Proof. We can assume that $\Phi$ is not identically zero, since otherwise
we need only choose $\varphi$ to be any function equal to zero almost everywhere
(discuss the uniqueness of $\varphi$ in this case). Let $K$ be the set of all
$\mu$-summable functions $f$ on $X$ such that

\[ \int_E f(x)\,d\mu \le \Phi(E) \]

for every $E \in \mathscr{S}_\mu$, and let

\[ M = \sup_{f \in K} \int_X f(x)\,d\mu. \]

Moreover, let $\{f_n\}$ be a sequence of functions in $K$ such that

\[ \lim_{n \to \infty} \int_X f_n(x)\,d\mu = M, \tag{9} \]

and let

\[ g_n(x) = \max\{f_1(x), \ldots, f_n(x)\}. \]

Then clearly

\[ g_1(x) \le g_2(x) \le \cdots \le g_n(x) \le \cdots. \]

Moreover,

\[ \int_E g_n(x)\,d\mu \le \Phi(E) \tag{10} \]

for every $E \in \mathscr{S}_\mu$. In fact, $E$ can be written in the form

\[ E = \bigcup_{k=1}^{n} E_k, \]

where the sets $E_1, \ldots, E_n$ are pairwise disjoint and $g_n(x) = f_k(x)$ on
$E_k$, and hence

\[ \int_E g_n(x)\,d\mu = \sum_{k=1}^{n} \int_{E_k} f_k(x)\,d\mu \le \sum_{k=1}^{n} \Phi(E_k) = \Phi(E). \]

In particular, it follows from (10) that $g_n \in K$, so that

\[ \int_X g_n(x)\,d\mu \le M. \]

But then

\[ \lim_{n \to \infty} \int_X g_n(x)\,d\mu = M, \]

since otherwise

\[ \lim_{n \to \infty} \int_X f_n(x)\,d\mu \le \lim_{n \to \infty} \int_X g_n(x)\,d\mu < M, \]

contrary to (9). Writing

\[ \varphi(x) = \sup_{n} g_n(x), \]

we find that

\[ \varphi(x) = \lim_{n \to \infty} g_n(x), \]

and hence, by Levi’s theorem (Theorem 2, p. 305),

\[ \int_X \varphi(x)\,d\mu = \lim_{n \to \infty} \int_X g_n(x)\,d\mu = M. \tag{11} \]


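Next we show that $\varphi$ is the required function, figuring in the
representation (8). By construction, the set function

\[ \lambda(E) = \Phi(E) - \int_E \varphi(x)\,d\mu \]

is nonnegative and in fact is a $\sigma$-additive measure. If $\lambda(E) \not\equiv 0$, then,
by the lemma, there is an $\varepsilon > 0$ and a set $A \in \mathscr{S}_\mu$ such that $\mu(A) > 0$
and

\[ \varepsilon \mu(E \cap A) \le \lambda(E \cap A) \]

for every $E \in \mathscr{S}_\mu$. Let

\[ h(x) = \varphi(x) + \varepsilon \chi_A(x), \]

where¹¹

\[ \chi_A(x) = \begin{cases} 1 & \text{if } x \in A, \\ 0 & \text{if } x \notin A. \end{cases} \]

Then

\[
\begin{aligned}
\int_E h(x)\,d\mu &= \int_E \varphi(x)\,d\mu + \varepsilon \mu(E \cap A) \\
&\le \int_E \varphi(x)\,d\mu + \lambda(E \cap A) \le \Phi(E),
\end{aligned}
\]

so that $h$ belongs to the set $K$ introduced at the beginning of the proof.
On the other hand, it follows from (11) that

\[ \int_X h(x)\,d\mu = \int_X \varphi(x)\,d\mu + \varepsilon \mu(A) > M, \]

contrary to the definition of $M$. Therefore $\lambda(E) \equiv 0$, which is equivalent
to (8).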
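Finally, to prove that $\varphi$ is unique to within its values on a set of
measure zero, suppose

\[ \Phi(E) = \int_E \varphi(x)\,d\mu = \int_E \varphi^*(x)\,d\mu \]

for all $E \in \mathscr{S}_\mu$. Then, by Chebyshev’s inequality (Theorem 5, p. 299),
we have

\[ \mu(A_m) \le m \int_{A_m} [\varphi(x) - \varphi^*(x)]\,d\mu = 0 \]

for every set

\[ A_m = \left\{x : \varphi(x) - \varphi^*(x) > \frac{1}{m}\right\} \qquad (m = 1, 2, \ldots), \]

and similarly

\[ \mu(B_n) = 0 \]

for every set

\[ B_n = \left\{x : \varphi^*(x) - \varphi(x) > \frac{1}{n}\right\} \qquad (n = 1, 2, \ldots). \]

But

\[ \{x : \varphi(x) \ne \varphi^*(x)\} = \Bigl(\bigcup_{m} A_m\Bigr) \cup \Bigl(\bigcup_{n} B_n\Bigr), \]

and hence

\[ \mu\{x : \varphi(x) \ne \varphi^*(x)\} = 0, \]

i.e., $\varphi(x) = \varphi^*(x)$ almost everywhere. ∎

¹¹ $\chi_A$ is called the characteristic function of the set $A$.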
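On a finite space the Radon-Nikodym theorem reduces to a pointwise division, which makes a convenient sanity check. The sketch below is an illustration under assumptions that are not in the text: both $\mu$ and $\Phi$ are given by point weights, and the names `density`, `integral` and `subsets` are hypothetical helpers.

```python
from fractions import Fraction
from itertools import combinations

def density(phi, mu):
    """Density of the charge phi with respect to the measure mu on a finite space.

    `phi` and `mu` map points to weights. Absolute continuity means that phi
    vanishes wherever mu does; on such points the density is arbitrary and
    is set to 0 here.
    """
    for x in mu:
        if mu[x] == 0 and phi[x] != 0:
            raise ValueError("phi is not absolutely continuous with respect to mu")
    return {x: (Fraction(phi[x]) / mu[x] if mu[x] != 0 else Fraction(0)) for x in mu}

def integral(f, mu, E):
    """Integral of f over E with respect to mu (a finite sum)."""
    return sum(f[x] * mu[x] for x in E)

def subsets(points):
    """All subsets of a finite collection of points, as tuples."""
    pts = list(points)
    for r in range(len(pts) + 1):
        yield from combinations(pts, r)


mu = {"a": Fraction(1, 2), "b": Fraction(1, 3), "c": Fraction(0)}
phi = {"a": Fraction(1), "b": Fraction(-2, 3), "c": Fraction(0)}
d = density(phi, mu)
```

The freedom in choosing the value at the null point `"c"` mirrors the statement that $\varphi$ is unique only to within its values on a set of $\mu$-measure zero.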


Remark 1. The function $\varphi$ figuring in the representation (8) is called the
Radon-Nikodym derivative (or simply the density) of the charge $\Phi$ with
respect to the measure $\mu$, and is denoted

\[ \frac{d\Phi}{d\mu}. \]

Clearly, Theorem 2 is the natural generalization of Lebesgue’s theorem
(Theorem 6, p. 340), which states that an absolutely continuous function
$F$ is the integral of its own derivative $F'$. However, in the case of a function
$F$ defined on the real line there is an explicit procedure for finding the
derivative of $F$ at a point $x_0$, namely evaluation of the limit

\[ \lim_{\Delta x \to 0} \frac{\Delta F}{\Delta x} = \lim_{\Delta x \to 0} \frac{F(x_0 + \Delta x) - F(x_0)}{\Delta x}, \]

whereas the Radon-Nikodym theorem only establishes the existence of the
derivative $d\Phi/d\mu$, without telling how to find it. However, an explicit
procedure can be given for evaluating $d\Phi/d\mu$ at a point $x_0 \in X$ by calculating
the limit

\[ \lim_{\varepsilon \to 0} \frac{\Phi(A_\varepsilon)}{\mu(A_\varepsilon)}, \]

where $\{A_\varepsilon\}$ is a system of sets “converging to the point $x_0$” as $\varepsilon \to 0$, in a
suitably defined sense.¹²


¹² For the details, see G. E. Shilov and B. L. Gurevich, Integral, Measure and
Derivative: A Unified Approach (translated by R. A. Silverman), Prentice-Hall, Inc., Englewood
Cliffs, N.J. (1966), Chap. 10.

Remark 2. It can also be shown¹³ that an arbitrary charge $\Phi$ has a unique
representation as the sum

\[ \Phi(E) = A(E) + S(E) + D(E) \]

of an absolutely continuous charge $A$, a singular charge $S$ and a discrete
charge $D$. This is the exact analogue of the Lebesgue decomposition on
p. 341.


Problem 1. Given any charge $\Phi$ defined on a $\sigma$-algebra $\mathscr{S}$, prove that
there is a constant $M > 0$ such that $|\Phi(E)| \le M$ for all $E \in \mathscr{S}$.


Problem 2. Give an example of two distinct Hahn decompositions of a
space $X$.


Problem 3. Prove that a charge $\Phi$ vanishes identically if it is both
absolutely continuous and singular with respect to a measure $\mu$.


Problem 4. Prove that if a charge $\Phi$ is concentrated on a set $A$, then so
are its positive, negative and total variations.


Problem 5. Prove that

a) Every absolutely continuous charge is continuous;
b) Every discrete charge is singular.


Problem 6. Prove that if a charge $\Phi$ is absolutely continuous (with
respect to a measure $\mu$), then so are its positive, negative and total variations.


Problem 7. Prove that if a charge $\Phi$ is discrete, then there are no more
than countably many points $x_1, x_2, \ldots, x_n, \ldots$ and corresponding real
numbers $h_1, h_2, \ldots, h_n, \ldots$ such that $\mu(\{x_n\}) = 0$ and

\[ \Phi(E) = \sum_{x_n \in E} h_n. \]

Write expressions for the positive, negative and total variations of $\Phi$.


Problem 8. Let $X$ be the square $0 \le x \le 1$, $0 \le y \le 1$ equipped with
ordinary two-dimensional Lebesgue measure $\mu$, and let $\Phi(E)$ be the ordinary
one-dimensional Lebesgue measure of the intersection of $E$ with the interval
$0 \le x \le 1$. Prove that $\Phi$ is continuous and singular, but not absolutely
continuous.


¹³ G. E. Shilov and B. L. Gurevich, op. cit., Chap. 9.


MORE ON INTEGRATION


35. Product Measures. Fubini’s Theorem 


The problem of reducing double (or multiple) integrals to iterated integrals 
plays an important role in classical analysis. In the Lebesgue theory, the key 
result along these lines is Fubini’s theorem, proved in Sec. 35.3. En route 
to Fubini’s theorem we will need the preliminary topics treated in Secs. 35.1 
and 35.2, which are also of interest in their own right. 


35.1. Direct products of sets and measures. By the direct (or Cartesian)
product of two sets $X$ and $Y$, denoted by $X \times Y$, we mean the set of all
ordered pairs $(x, y)$ where $x \in X$, $y \in Y$. Similarly, by the direct product of
$n$ sets $X_1, X_2, \ldots, X_n$, denoted by

\[ X_1 \times X_2 \times \cdots \times X_n, \tag{1} \]

we mean the set of all ordered $n$-tuples $(x_1, x_2, \ldots, x_n)$, where $x_1 \in X_1$,
$x_2 \in X_2, \ldots, x_n \in X_n$. In particular, if

\[ X_1 = X_2 = \cdots = X_n = X, \]

we write (1) simply as $X^n$, the “$n$th power of $X$.”


Example 1. Real $n$-space $R^n$ is the $n$th power of the real line $R^1$, as
anticipated by the notation.


Example 2. The unit cube $I^n$ in $n$-space, i.e., the set of all elements of $R^n$
with coordinates satisfying the inequalities

\[ 0 \le x_k \le 1 \qquad (k = 1, 2, \ldots, n), \]

is the $n$th power of the closed unit interval $I^1 = [0, 1]$.
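For finite sets, the direct product and the $n$th power can be formed with the standard library. This is a minimal sketch; the wrapper names `direct_product` and `power` are hypothetical, and only `itertools.product` is a real library call.

```python
from itertools import product

def direct_product(*sets):
    """Set of all ordered tuples (x_1, ..., x_n) with x_k taken from the k-th set."""
    return set(product(*sets))

def power(X, n):
    """The n-th power X^n of a finite set X."""
    return set(product(X, repeat=n))


X = {0, 1}
Y = {"a", "b", "c"}
pairs = direct_product(X, Y)      # X x Y
cube_vertices = power(X, 3)       # {0, 1}^3, the vertices of the unit cube in 3-space
```

The number of elements multiplies, so `pairs` has $2 \cdot 3$ elements and `cube_vertices` has $2^3$.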


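Now let $\mathscr{S}_1, \mathscr{S}_2, \ldots, \mathscr{S}_n$ be systems of subsets of the sets $X_1, X_2, \ldots,
X_n$, respectively. Then by

\[ \mathfrak{S} = \mathscr{S}_1 \times \mathscr{S}_2 \times \cdots \times \mathscr{S}_n \]

we mean the system of subsets of the direct product (1) which can be
represented in the form

\[ A = A_1 \times A_2 \times \cdots \times A_n, \]

where

\[ A_k \in \mathscr{S}_k \qquad (k = 1, 2, \ldots, n). \]

If

\[ \mathscr{S}_1 = \mathscr{S}_2 = \cdots = \mathscr{S}_n = \mathscr{S}, \]

then $\mathfrak{S}$ is the “$n$th power of $\mathscr{S}$,” written

\[ \mathfrak{S} = \mathscr{S}^n. \]

For example, the system of all closed rectangular parallelepipeds in $R^n$ is the
$n$th power of the system of all closed intervals in $R^1$.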


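THEOREM 1. If $\mathscr{S}_1, \mathscr{S}_2, \ldots, \mathscr{S}_n$ are semirings, then so is the set
$\mathfrak{S} = \mathscr{S}_1 \times \mathscr{S}_2 \times \cdots \times \mathscr{S}_n$.


Proof. By the definition of a semiring (see p. 32), we must show that¹

a) If $A, B \in \mathfrak{S}$, then $A \cap B \in \mathfrak{S}$;
b) If $A, B \in \mathfrak{S}$ and $B \subset A$, then $A$ can be represented as a finite
union

\[ A = \bigcup_{k=1}^{p} C^{(k)} \]

of pairwise disjoint sets $C^{(k)} \in \mathfrak{S}$, with $B = C^{(1)}$.

It is clearly enough to prove these assertions for the case $n = 2$. Thus
suppose $A \in \mathscr{S}_1 \times \mathscr{S}_2$, $B \in \mathscr{S}_1 \times \mathscr{S}_2$. Then

\[
\begin{aligned}
A &= A_1 \times A_2 \qquad (A_1 \in \mathscr{S}_1,\ A_2 \in \mathscr{S}_2), \\
B &= B_1 \times B_2 \qquad (B_1 \in \mathscr{S}_1,\ B_2 \in \mathscr{S}_2),
\end{aligned}
\tag{2}
\]

and hence

\[ A \cap B = (A_1 \times A_2) \cap (B_1 \times B_2) = (A_1 \cap B_1) \times (A_2 \cap B_2). \]

But $A_1 \cap B_1 \in \mathscr{S}_1$, $A_2 \cap B_2 \in \mathscr{S}_2$, since $\mathscr{S}_1$ and $\mathscr{S}_2$ are semirings. It
follows that $A \cap B \in \mathscr{S}_1 \times \mathscr{S}_2$. This proves a).
To prove b), suppose that

\[ B_1 \subset A_1, \qquad B_2 \subset A_2, \]

in addition to (2). Then, since $\mathscr{S}_1$ and $\mathscr{S}_2$ are semirings, there are finite
expansions

\[
\begin{aligned}
A_1 &= B_1 \cup B_1^{(1)} \cup \cdots \cup B_1^{(k)}, \\
A_2 &= B_2 \cup B_2^{(1)} \cup \cdots \cup B_2^{(l)},
\end{aligned}
\]

where the sets $B_1, B_1^{(1)}, \ldots, B_1^{(k)}$ are pairwise disjoint and belong to $\mathscr{S}_1$,
while the sets $B_2, B_2^{(1)}, \ldots, B_2^{(l)}$ are pairwise disjoint and belong to $\mathscr{S}_2$.
Therefore

\[
\begin{aligned}
A = A_1 \times A_2 = {}& (B_1 \times B_2) \cup (B_1 \times B_2^{(1)}) \cup \cdots \cup (B_1 \times B_2^{(l)}) \\
& \cup (B_1^{(1)} \times B_2) \cup (B_1^{(1)} \times B_2^{(1)}) \cup \cdots \cup (B_1^{(1)} \times B_2^{(l)}) \\
& \cup \cdots \cup (B_1^{(k)} \times B_2) \cup (B_1^{(k)} \times B_2^{(1)}) \cup \cdots \cup (B_1^{(k)} \times B_2^{(l)})
\end{aligned}
\]

is the desired finite expansion of $A_1 \times A_2$, where $B_1 \times B_2$ is the first term
and the other terms are pairwise disjoint and belong to $\mathfrak{S} =
\mathscr{S}_1 \times \mathscr{S}_2$. ∎

¹ Note that the empty set $\emptyset$ belongs to $\mathfrak{S}$, since $\emptyset = \emptyset \times \emptyset \times \cdots \times \emptyset$ (why?).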


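Now let $\mathcal{S}_1, \mathcal{S}_2, \ldots, \mathcal{S}_n$ be n semirings, equipped with measures

$$\mu_1(A_1),\ \mu_2(A_2),\ \ldots,\ \mu_n(A_n) \quad (A_k \in \mathcal{S}_k), \tag{3}$$

and let μ be the measure on the semiring $\mathfrak{S} = \mathcal{S}_1 \times \mathcal{S}_2 \times \cdots \times \mathcal{S}_n$
defined by the formula

$$\mu(A) = \mu_1(A_1)\,\mu_2(A_2) \cdots \mu_n(A_n)$$

for every $A = A_1 \times A_2 \times \cdots \times A_n$. Then μ is called the direct (or Cartesian)
product² of the measures (3), and is denoted by

$$\mu = \mu_1 \times \mu_2 \times \cdots \times \mu_n.$$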


To confirm that μ is indeed a measure, we now show that μ is additive (μ is
obviously real and nonnegative). It will again be enough to consider the
case n = 2. Suppose

$$A = A_1 \times A_2 = \bigcup_{k=1}^{t} B^{(k)}, \tag{4}$$

where

$$B^{(i)} \cap B^{(j)} = \emptyset \quad (i \ne j)$$

and

$$B^{(k)} = B_1^{(k)} \times B_2^{(k)}.$$

According to Lemma 2, p. 33, there are finite expansions

$$A_1 = \bigcup_{m=1}^{r} C_1^{(m)}, \qquad A_2 = \bigcup_{n=1}^{s} C_2^{(n)},$$

each involving pairwise disjoint sets, such that each $B_1^{(k)}$ is a finite union

$$B_1^{(k)} = \bigcup_{m \in M_k} C_1^{(m)}$$

of certain of the sets $C_1^{(m)}$, while each $B_2^{(k)}$ is a finite union

$$B_2^{(k)} = \bigcup_{n \in N_k} C_2^{(n)}$$

of certain of the sets $C_2^{(n)}$ (here $M_k$ denotes some subset of the set {1, 2, ..., r}
and $N_k$ some subset of the set {1, 2, ..., s}). But then, by the additivity of
$\mu_1$ and $\mu_2$, we have

$$\begin{aligned}
\mu(A) = \mu_1(A_1)\,\mu_2(A_2) &= \sum_{m=1}^{r} \mu_1(C_1^{(m)}) \sum_{n=1}^{s} \mu_2(C_2^{(n)}) \\
&= \sum_{k=1}^{t} \sum_{m \in M_k} \mu_1(C_1^{(m)}) \sum_{n \in N_k} \mu_2(C_2^{(n)}) \\
&= \sum_{k=1}^{t} \mu_1(B_1^{(k)})\,\mu_2(B_2^{(k)}) = \sum_{k=1}^{t} \mu(B^{(k)}),
\end{aligned}$$

which, when compared with (4), shows the additivity of $\mu = \mu_1 \times \mu_2$.

² The term product measure will be used with a different meaning below.


Example 3. Thus the additivity of area of rectangles in the plane follows 
from the additivity of length of intervals on the line. 
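
As a small illustration of Example 3, the following sketch checks additivity of area on one concrete decomposition of a rectangle into grid cells, in the spirit of the sets $C_1^{(m)} \times C_2^{(n)}$ used in the proof. The cut points and the helper names `length`, `area` and `grid_cells` are assumptions chosen for the illustration, and exact rational arithmetic is used so that the comparison is not blurred by rounding.

```python
from fractions import Fraction
from itertools import product


def length(interval):
    """Length of a half-open interval [lo, hi)."""
    lo, hi = interval
    return hi - lo


def area(rect):
    """Direct product of the two length measures: the product of the side lengths."""
    side_x, side_y = rect
    return length(side_x) * length(side_y)


def grid_cells(x_cuts, y_cuts):
    """Split a rectangle into pairwise disjoint cells along the given cut points."""
    xs = list(zip(x_cuts, x_cuts[1:]))
    ys = list(zip(y_cuts, y_cuts[1:]))
    return list(product(xs, ys))


# Hypothetical cut points on each axis.
x_cuts = [Fraction(0), Fraction(1, 3), Fraction(2), Fraction(3)]
y_cuts = [Fraction(0), Fraction(1, 2), Fraction(2)]

whole = ((x_cuts[0], x_cuts[-1]), (y_cuts[0], y_cuts[-1]))
cells = grid_cells(x_cuts, y_cuts)
total = sum(area(cell) for cell in cells)

# The area of the whole rectangle agrees with the sum of the cell areas.
print(area(whole), total)
```

The check covers only one finite decomposition; the general statement is the additivity argument given above.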


THEOREM 2. If the measures $\mu_1, \mu_2, \ldots, \mu_n$ are σ-additive, then so is
the measure $\mu = \mu_1 \times \mu_2 \times \cdots \times \mu_n$.


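Proof. Again we need only consider the case n = 2. Let $\lambda_1$ denote the
Lebesgue extension of the measure $\mu_1$, and suppose

$$C = \bigcup_{n=1}^{\infty} C_n,$$

where the sets $C_n$ are pairwise disjoint and the sets $C, C_n$ belong to
$\mathcal{S}_1 \times \mathcal{S}_2$, i.e.,

$$C = A \times B \quad (A \in \mathcal{S}_1,\ B \in \mathcal{S}_2),$$
$$C_n = A_n \times B_n \quad (A_n \in \mathcal{S}_1,\ B_n \in \mathcal{S}_2).$$

Moreover, let

$$f_n(x) = \begin{cases} \mu_2(B_n) & \text{if } x \in A_n, \\ 0 & \text{if } x \notin A_n. \end{cases}$$

We then have

$$\sum_{n=1}^{\infty} f_n(x) = \mu_2(B) \quad \text{if } x \in A,$$

and hence, by the corollary on p. 307,

$$\sum_{n=1}^{\infty} \int_A f_n(x)\,d\lambda_1 = \int_A \mu_2(B)\,d\lambda_1 = \lambda_1(A)\,\mu_2(B) = \mu_1(A)\,\mu_2(B) = \mu(C). \tag{5}$$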


356 MORE ON INTEGRATION CHAP. 10 


But

$$\int_A f_n(x)\,d\lambda_1 = \mu_1(A_n)\,\mu_2(B_n) = \mu(C_n). \tag{6}$$

Substituting (6) into (5), we get

$$\mu(C) = \sum_{n=1}^{\infty} \mu(C_n). \quad ∎$$
Again let $\mathcal{S}_1, \mathcal{S}_2, \ldots, \mathcal{S}_n$ be n semirings, this time equipped with
σ-additive measures (3). Then it follows from Theorem 2 that the measure³

$$m = \mu_1 \times \mu_2 \times \cdots \times \mu_n \tag{7}$$

is σ-additive on the semiring

$$\mathfrak{S} = \mathcal{S}_1 \times \mathcal{S}_2 \times \cdots \times \mathcal{S}_n.$$

Therefore, as in Sec. 27, m has a Lebesgue extension μ defined on a σ-ring
$\mathcal{S}_\mu \supset \mathfrak{S}$. This measure μ is called the product measure of the measures (3),
and is denoted by

$$\mu = \mu_1 \otimes \mu_2 \otimes \cdots \otimes \mu_n. \tag{8}$$

The distinction between the meaning of the symbols × and ⊗ in (7) and
(8) is crucial.


Example 4. Let

$$\mu_1 = \mu_2 = \cdots = \mu_n = \mu^1,$$

where $\mu^1$ is ordinary Lebesgue measure on the line. Then the product
measure (8) is ordinary Lebesgue measure in n-space.


35.2. Evaluation of a product measure. Let G be a region in the xy-plane 
bounded by the vertical lines x = a, x = b (a < b) and the curves y = f(x), 
y = g(x), where f(x) < g(x). Then it will be recalled from calculus that the 
area of G is given by the integral 


$$\int_a^b [g(x) - f(x)]\,dx,$$

where the difference $g(x_0) - f(x_0)$ is just the length of the segment in which
the vertical line $x = x_0$ intersects the region G. As we now show, the natural
generalization of this method can be used to evaluate an arbitrary product 
measure: 


THEOREM 3. Let μ be the product measure

$$\mu = \mu_x \otimes \mu_y$$

of two measures $\mu_x$ and $\mu_y$ such that

1) $\mu_x$ is σ-additive on a Borel algebra $\mathcal{S}_{\mu_x}$ of subsets of a set X;
2) $\mu_y$ is σ-additive on a Borel algebra $\mathcal{S}_{\mu_y}$ of subsets of a set Y;
3) $\mu_x$ and $\mu_y$ are complete, in the sense that $B \subset A$ and $\mu_x(A) = 0$
implies that B is measurable (with measure zero), and similarly for
$\mu_y$.⁴

Then

$$\mu(A) = \int_X \mu_y(A_x)\,d\mu_x = \int_Y \mu_x(A_y)\,d\mu_y \tag{9}$$

for every μ-measurable set A, where⁵

$$A_x = \{y : (x, y) \in A\} \quad (x \text{ fixed}),$$
$$A_y = \{x : (x, y) \in A\} \quad (y \text{ fixed}).$$

³ We change to the symbol m here, to “free” μ for use in formula (8).
Proof. We note in passing that the integral over X in (9) reduces to
an integral over the set of the form

$$\bigcup_y A_y \subset X$$

outside which $\mu_y(A_x)$ vanishes (and similarly for the integral over Y).
It will be enough to prove that

$$\mu(A) = \int_X \varphi_A(x)\,d\mu_x, \tag{10}$$

where

$$\varphi_A(x) = \mu_y(A_x),$$

since the other part of (9) is proved in exactly the same way. Observe
that implicit in the theorem is the conclusion that the set $A_x$ is $\mu_y$-measurable
for almost all x (in the sense of the measure $\mu_x$) and that the function
$\varphi_A(x)$ is $\mu_x$-measurable, since otherwise (10) would be meaningless.
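The measure μ is the Lebesgue extension of the measure

$$m = \mu_x \times \mu_y$$

defined on the semiring $\mathcal{S}_m$ of all sets of the form

$$A = A_{y_0} \times A_{x_0} \quad (A \in \mathcal{S}_\mu),$$

where $\mathcal{S}_\mu$ is the Borel algebra of μ-measurable subsets of X × Y. But
(10) obviously holds for all such sets, since for them

$$\varphi_A(x) = \begin{cases} \mu_y(A_{x_0}) & \text{if } x \in A_{y_0}, \\ 0 & \text{if } x \notin A_{y_0}. \end{cases}$$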


⁴ The Lebesgue extension of any measure is complete (see Problem 7, p. 280).

⁵ If X is the x-axis and Y the y-axis (so that X × Y is the xy-plane), then $A_{x_0}$ is the
projection onto the y-axis of the set in which the vertical line $x = x_0$ intersects the set A
(and similarly for $A_{y_0}$).

Moreover, (10) carries over at once to the ring $\mathcal{R}(\mathcal{S}_m)$ generated by
$\mathcal{S}_m$, since $\mathcal{R}(\mathcal{S}_m)$ is just the system of all sets which can be represented
as finite unions of pairwise disjoint sets of $\mathcal{S}_m$ (recall Theorem 3, p. 34).

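To prove (10) for an arbitrary set $A \in \mathcal{S}_\mu$, we recall from Theorem 8,
p. 277 that there are sets

$$B_{nk} \in \mathcal{R}(\mathcal{S}_m) \quad (B_{n1} \subset B_{n2} \subset \cdots \subset B_{nk} \subset \cdots)$$

and corresponding sets

$$B_n = \bigcup_k B_{nk} \in \mathcal{S}_\mu \quad (B_1 \supset B_2 \supset \cdots \supset B_n \supset \cdots)$$

such that

$$A \subset B = \bigcap_n B_n, \qquad \mu(A) = \mu(B). \tag{11}$$

Clearly,

$$\varphi_{B_n}(x) = \lim_{k \to \infty} \varphi_{B_{nk}}(x), \qquad \varphi_{B_{n1}}(x) \le \varphi_{B_{n2}}(x) \le \cdots \le \varphi_{B_{nk}}(x) \le \cdots,$$
$$\varphi_B(x) = \lim_{n \to \infty} \varphi_{B_n}(x), \qquad \varphi_{B_1}(x) \ge \varphi_{B_2}(x) \ge \cdots \ge \varphi_{B_n}(x) \ge \cdots.$$

Hence we can invoke Levi’s theorem⁶ to extend (10) from the ring $\mathcal{R}(\mathcal{S}_m)$
to the system of all sets $B \in \mathcal{S}_\mu$ of the form

$$B = \bigcap_n \bigcup_k B_{nk} \quad (B_{nk} \in \mathcal{R}(\mathcal{S}_m)). \tag{12}$$

Moreover if μ(A) = 0, then μ(B) = 0, because of (11), and hence

$$\varphi_B(x) = \mu_y(B_x) = 0$$

almost everywhere. Therefore $A_x$ is measurable and

$$\varphi_A(x) = \mu_y(A_x) = 0$$

for almost all x, since $A_x \subset B_x$. But then

$$\int_X \varphi_A(x)\,d\mu_x = 0 = \mu(A).$$

In other words, (10) holds for all sets of measure zero, as well as for all
sets of the form (12). But, according to (11), an arbitrary set $A \in \mathcal{S}_\mu$
can be represented as

$$A = B - Z,$$

where B is of the form (12) and Z is of measure zero. Therefore

$$B = A \cup Z \quad (A \cap Z = \emptyset).$$

It follows that

$$\mu(A) = \mu(B) = \int_X \varphi_B(x)\,d\mu_x = \int_X \varphi_A(x)\,d\mu_x + \int_X \varphi_Z(x)\,d\mu_x = \int_X \varphi_A(x)\,d\mu_x,$$

i.e., (10) holds for every $A \in \mathcal{S}_\mu$. ∎

⁶ See Theorem 2, p. 305 and Problem 2, p. 311.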


Example 1. Let M be any $\mu_x$-measurable set, and let f be an integrable
nonnegative function. Moreover, let Y be the y-axis, and let $\mu_y$ be ordinary
Lebesgue measure on the line. Consider the set

$$A = \{(x, y) : x \in M,\ 0 \le y \le f(x)\}. \tag{13}$$

Then

$$\varphi_A(x) = \mu_y(A_x) = \begin{cases} f(x) & \text{if } x \in M, \\ 0 & \text{if } x \notin M, \end{cases}$$

and hence, by Theorem 3,

$$\mu(A) = \int_X \varphi_A(x)\,d\mu_x = \int_M f(x)\,d\mu_x. \tag{14}$$

This allows us to interpret the Lebesgue integral of a nonnegative function
over a set $M \subset X$ in terms of the μ-measure of the set (13), where
$\mu = \mu_x \otimes \mu_y$.


Example 2. In the preceding example, let X be the x-axis and let M be a
closed interval [a, b]. Moreover, suppose f is nonnegative and Riemann-
integrable on [a, b]. Then (14) reduces to the familiar formula

$$\mu(A) = \int_a^b f(x)\,dx$$

for the area under the graph of the function y = f(x) between x = a and
x = b.
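
The slicing formula (9) can be tried numerically. The sketch below approximates the area of the unit disk by integrating the length of its vertical sections $A_x$, which is $2\sqrt{1 - x^2}$ for $|x| \le 1$. The midpoint rule, the number of subintervals and the function names are assumptions made for the illustration; they are not part of the theorem.

```python
import math


def slice_length(x):
    """Lebesgue measure of the vertical section A_x of the unit disk."""
    return 2.0 * math.sqrt(max(0.0, 1.0 - x * x))


def measure_by_slices(section_measure, a, b, n=200_000):
    """Midpoint-rule approximation of the integral of section_measure over [a, b]."""
    h = (b - a) / n
    return h * sum(section_measure(a + (k + 0.5) * h) for k in range(n))


disk_area = measure_by_slices(slice_length, -1.0, 1.0)

# The two printed numbers should agree to several decimal places.
print(disk_area, math.pi)
```

Here the disk plays the role of the region G bounded by the curves $y = -\sqrt{1 - x^2}$ and $y = \sqrt{1 - x^2}$ discussed at the beginning of this section.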


35.3. Fubini’s theorem. The next theorem is basic in the theory of 
multiple integration: 


THEOREM 4 (Fubini). Let $\mu_x$ and $\mu_y$ be the same as in Theorem 3, let μ
be the product measure $\mu_x \otimes \mu_y$, and let f(x, y) be μ-integrable on the set
$A \subset X \times Y$. Then

$$\int_A f(x, y)\,d\mu = \int_X \left( \int_{A_x} f(x, y)\,d\mu_y \right) d\mu_x = \int_Y \left( \int_{A_y} f(x, y)\,d\mu_x \right) d\mu_y. \tag{15}$$

Proof. Note that implicit in the theorem is the conclusion that the
“inner integrals” in parentheses exist for almost all values of the variable
over which they are integrated (x in the first case, y in the second). We
begin by assuming temporarily that $f(x, y) \ge 0$. Consider the triple
Cartesian product

$$U = X \times Y \times Z,$$

where Z is the real line, equipped with the product measure

$$\mu_u = \mu_x \otimes \mu_y \otimes \mu^1 = \mu \otimes \mu^1 = \mu_x \otimes (\mu_y \otimes \mu^1)$$

(see Problem 3), where $\mu^1$ is ordinary Lebesgue measure on the line.
Moreover, consider the set $W \subset U$ defined by

$$W = \{(x, y, z) : x \in A_y,\ y \in A_x,\ 0 \le z \le f(x, y)\}.$$

By (14),

$$\mu_u(W) = \int_A f(x, y)\,d\mu. \tag{16}$$

On the other hand, by Theorem 3,

$$\mu_u(W) = \int_X \lambda(W_x)\,d\mu_x, \tag{17}$$

where

$$\lambda = \mu_y \otimes \mu^1, \qquad W_x = \{(y, z) : (x, y, z) \in W\} \quad (x \text{ fixed}).$$

Using (14) again, we obtain

$$\lambda(W_x) = \int_{A_x} f(x, y)\,d\mu_y. \tag{18}$$

Comparing (16)-(18), we get part of (15). The rest of (15) is proved in
exactly the same way. To remove the restriction that f(x, y) be
nonnegative, we merely note that

$$f(x, y) = f^+(x, y) - f^-(x, y),$$

where the functions

$$f^+(x, y) = \frac{|f(x, y)| + f(x, y)}{2}, \qquad f^-(x, y) = \frac{|f(x, y)| - f(x, y)}{2}$$

are both nonnegative. ∎


Remark. Thus Fubini’s theorem asserts that if the “double integral”

$$I = \int_A f(x, y)\,d\mu \tag{19}$$

exists, then so do the “iterated integrals”

$$I_{xy} = \int_X \left( \int_{A_x} f(x, y)\,d\mu_y \right) d\mu_x, \qquad I_{yx} = \int_Y \left( \int_{A_y} f(x, y)\,d\mu_x \right) d\mu_y, \tag{20}$$

and moreover $I = I_{xy} = I_{yx}$.

Problem 1. Give an example of a set in $R^2$ which is not a direct product
of any two sets in $R^1$.


Problem 2. Prove that the direct product of two rings (or σ-rings) need
not be a ring (or σ-ring).


Problem 3. Given three spaces X, Y and Z, equipped with measures
$\mu_x$, $\mu_y$ and $\mu_z$, respectively, prove that $(\mu_x \otimes \mu_y) \otimes \mu_z$ and $\mu_x \otimes (\mu_y \otimes \mu_z)$
are identical measures on X × Y × Z.


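Problem 4. Let A = [−1, 1] × [−1, 1] and

$$f(x, y) = \frac{xy}{(x^2 + y^2)^2}.$$

Prove that
a) The iterated integrals (20) exist and are equal;
b) The double integral (19) fails to exist.

Hint. Since

$$\int_{-1}^{1} f(x, y)\,dx = \int_{-1}^{1} f(x, y)\,dy = 0,$$

we have

$$\int_{-1}^{1} \left( \int_{-1}^{1} f(x, y)\,dx \right) dy = \int_{-1}^{1} \left( \int_{-1}^{1} f(x, y)\,dy \right) dx = 0.$$

On the other hand, the double integral fails to exist, since

$$\int_{-1}^{1} \int_{-1}^{1} |f(x, y)|\,dx\,dy \ge \int_0^1 dr \int_0^{2\pi} \frac{|\sin\theta \cos\theta|}{r}\,d\theta = 2 \int_0^1 \frac{dr}{r} = \infty$$

after transforming to polar coordinates.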


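Problem 5. Let A = [0, 1] × [0, 1] and

$$f(x, y) = \begin{cases}
2^{2n} & \text{if } \dfrac{1}{2^n} \le x < \dfrac{1}{2^{n-1}},\ \dfrac{1}{2^n} \le y < \dfrac{1}{2^{n-1}}, \\[2ex]
-2^{2n+1} & \text{if } \dfrac{1}{2^{n+1}} \le x < \dfrac{1}{2^n},\ \dfrac{1}{2^n} \le y < \dfrac{1}{2^{n-1}}, \\[2ex]
0 & \text{otherwise.}
\end{cases}$$

Prove that the iterated integrals (20) exist but are unequal.

Ans.
$$\int_0^1 \left( \int_0^1 f(x, y)\,dx \right) dy = 0, \qquad \int_0^1 \left( \int_0^1 f(x, y)\,dy \right) dx = 1.$$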
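
The stated answer can be checked mechanically, because the function of Problem 5 is constant on each block $I_m \times I_n$, where $I_k = [2^{-k}, 2^{-(k-1)})$. The following sketch computes both iterated integrals in exact rational arithmetic. The function names and the cutoff `levels` are assumptions made for the illustration; the cutoff discards only a strip near 0 on which the inner integrals vanish anyway.

```python
from fractions import Fraction


def block_value(m, n):
    """Value of f on the dyadic block I_m x I_n, where I_k = [2**-k, 2**-(k-1))."""
    if m == n:
        return Fraction(2) ** (2 * n)
    if m == n + 1:
        return -(Fraction(2) ** (2 * n + 1))
    return Fraction(0)


def width(k):
    """Length of the dyadic interval I_k."""
    return Fraction(1, 2 ** k)


def inner_dx(n, levels):
    """Integral of f with respect to x, for any fixed y in I_n."""
    return sum(block_value(m, n) * width(m) for m in range(1, levels + 2))


def inner_dy(m, levels):
    """Integral of f with respect to y, for any fixed x in I_m."""
    return sum(block_value(m, n) * width(n) for n in range(1, levels + 2))


def iterated(levels=30):
    """Both iterated integrals, with the outer integral taken over I_1, ..., I_levels."""
    dx_then_dy = sum(inner_dx(n, levels) * width(n) for n in range(1, levels + 1))
    dy_then_dx = sum(inner_dy(m, levels) * width(m) for m in range(1, levels + 1))
    return dx_then_dy, dy_then_dx


print(iterated())
```

The two values differ because the only row of blocks that does not cancel is the one with x in $I_1 = [1/2, 1)$, where the negative blocks of the second case are absent.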


Problem 6. The preceding two problems show that the existence of the
iterated integrals (20) does not imply either the existence of the double
integral (19) or the validity of formula (15). However, show that the
existence of either of the integrals

$$\int_X \left( \int_{A_x} |f|\,d\mu_y \right) d\mu_x, \qquad \int_Y \left( \int_{A_y} |f|\,d\mu_x \right) d\mu_y \tag{21}$$

implies both the existence of (19) and the validity of (15).


Hint. Suppose the first of the integrals (21) exists and equals M. The
function

$$f_n(x, y) = \min\{|f(x, y)|,\ n\}$$

is measurable and bounded, and hence summable on A. By Fubini’s theorem,

$$\int_A f_n(x, y)\,d\mu = \int_X \left( \int_{A_x} f_n(x, y)\,d\mu_y \right) d\mu_x \le M.$$

Moreover, $\{f_n(x, y)\}$ is a nondecreasing sequence of functions converging
to |f(x, y)|. Use Levi’s theorem to deduce the summability of |f(x, y)|
and hence that of f(x, y) on A.


Problem 7. Show that Fubini’s theorem continues to hold for the case of
σ-finite measures (cf. Sec. 30.2).


36. The Stieltjes Integral 


36.1. Stieltjes measures. Let F be a nondecreasing function defined on a
closed interval [a, b], and suppose F is continuous from the left at every
point of (a, b]. Let $\mathcal{S}$ be the semiring of all subintervals (open, closed or
half-open) of [a, b), and let m be the measure on $\mathcal{S}$ defined by the formulas⁷

$$\begin{aligned}
m(\alpha, \beta) &= F(\beta) - F(\alpha + 0), \\
m[\alpha, \beta] &= F(\beta + 0) - F(\alpha), \\
m(\alpha, \beta] &= F(\beta + 0) - F(\alpha + 0), \\
m[\alpha, \beta) &= F(\beta) - F(\alpha).
\end{aligned} \tag{1}$$

Finally, let $\mu_F$ be the Lebesgue extension of m, defined on the σ-algebra
$\mathcal{S}_{\mu_F}$ of $\mu_F$-measurable sets. In particular, $\mathcal{S}_{\mu_F}$ contains all subintervals of
[a, b) and hence all Borel subsets of [a, b). Then $\mu_F$ is called the (Lebesgue-)
Stieltjes measure corresponding to the function F, and the function F itself
is called the generating function of $\mu_F$.


Example 1. The Stieltjes measure corresponding to the generating
function F(x) = x is just ordinary Lebesgue measure on the line.

⁷ To avoid confusion, we omit “outer parentheses,” writing $m(\alpha, \beta)$ instead of $m((\alpha, \beta))$,
and similarly in the rest of the formulas (1). Moreover, in $m[\alpha, \beta]$, we allow the case
$\alpha = \beta$.


Example 2. Let F be a jump function, with discontinuity points
$x_1, x_2, \ldots, x_n, \ldots$ and corresponding jumps $h_1, h_2, \ldots, h_n, \ldots$. Then
every subset $A \subset [a, b)$ is $\mu_F$-measurable, with measure

$$\mu_F(A) = \sum_{x_n \in A} h_n. \tag{2}$$

In fact, according to (1), every single-element set $\{x_n\}$ has measure $h_n$, and
moreover it is clear that the measure of the complement of the set
$\{x_1, x_2, \ldots, x_n, \ldots\}$ is zero. But then (2) holds, by the σ-additivity of $\mu_F$. A
Stieltjes measure $\mu_F$ of this type, generated by a jump function, is said to be
discrete.


Example 3. Let F be an absolutely continuous nondecreasing function on
[a, b), with derivative f = F′. Then the Stieltjes measure $\mu_F$ is defined on
all Lebesgue-measurable subsets $A \subset [a, b)$ and

$$\mu_F(A) = \int_A f(x)\,dx. \tag{3}$$

In fact, by Theorem 6, p. 340,

$$\mu_F(\alpha, \beta) = F(\beta) - F(\alpha) = \int_\alpha^\beta f(x)\,dx \tag{4}$$

for every open interval $(\alpha, \beta)$. But then (3) holds for every Lebesgue-
measurable set $A \subset [a, b)$ since a Lebesgue extension of a σ-additive measure
is uniquely determined by its values on the original semiring.⁸ A Stieltjes
measure $\mu_F$ of this type, with an absolutely continuous generating function,
is itself said to be absolutely continuous.


Example 4. Let F be singular (and continuous) as on p. 341. Then the
corresponding Stieltjes measure $\mu_F$ is concentrated on the set of Lebesgue
measure zero where the derivative F′ is nonzero or fails to exist. A Stieltjes
measure of this type is said to be singular.


Example 5. By the Lebesgue decomposition (p. 341), an arbitrary
generating function F can be represented as a sum

$$F(x) = D(x) + A(x) + S(x) \tag{5}$$

of a jump function D, an absolutely continuous function A and a singular
function S (verify that D, A and S are themselves generating functions).
Moreover, each of the “components” D, A and S is uniquely determined to
within an additive constant (see Problem 4, p. 342). But clearly

$$\mu_F = \mu_D + \mu_A + \mu_S.$$

It follows that an arbitrary Lebesgue-Stieltjes measure can be represented
as a sum of a discrete measure $\mu_D$, an absolutely continuous measure $\mu_A$ and
a singular measure $\mu_S$. Moreover, this representation is unique (why?).

⁸ Give a more detailed argument, recalling Problem 1, p. 279. Note that in this case
$m(\alpha, \beta) = m[\alpha, \beta] = m(\alpha, \beta] = m[\alpha, \beta)$.


Remark. We can easily extend the notion of a Stieltjes measure on a
(finite) interval [a, b) to that of a Stieltjes measure on the whole line (−∞, ∞).
Let F be a bounded nondecreasing function on (−∞, ∞), so that

$$m \le F(x) \le M \quad (-\infty < x < \infty).$$

Using the formulas (1) to define the measure of arbitrary intervals (open,
closed or half-open), not just subintervals of a fixed interval [a, b), we get a
finite measure $\mu_F$ on the whole line, called a (Lebesgue-) Stieltjes measure,
as before. In particular, we have

$$\mu_F(-\infty, \infty) = F(\infty) - F(-\infty)$$

for the measure of the whole line, where

$$F(\infty) = \lim_{x \to \infty} F(x), \qquad F(-\infty) = \lim_{x \to -\infty} F(x)$$

(the existence of the limits follows from the fact that F is bounded and
monotonic).


36.2. The Lebesgue-Stieltjes integral. Let $\mu_F$ be a Stieltjes measure on
the interval [a, b), corresponding to the generating function F, and let f be
a $\mu_F$-summable function. Then by the Lebesgue-Stieltjes integral of f (with
respect to F), denoted by

$$\int_a^b f(x)\,dF(x), \tag{6}$$

we simply mean the Lebesgue integral

$$\int_a^b f(x)\,d\mu_F.$$


Example 1. Let F be the jump function

$$F(x) = \sum_{x_n < x} h_n,$$

so that $\mu_F$ is a discrete measure. Then (6) reduces to the sum

$$\sum_n f(x_n)\,h_n.$$
Example 2. If F is absolutely continuous, then

$$\int_a^b f(x)\,dF(x) = \int_a^b f(x)\,F'(x)\,dx, \tag{7}$$

where the right-hand side is the integral of fF′ with respect to ordinary
Lebesgue measure on the line. In the case where f(x) = const, this is an
immediate consequence of (4). Moreover, by the σ-additivity of integrals,
(7) can be extended to the case of any simple function f which is
$\mu_F$-summable. More generally, let $\{f_n\}$ be a sequence of such simple functions
converging uniformly to f, so that $\{f_n F'\}$ converges uniformly to fF′. It can
be assumed without loss of generality that

$$f_1(x) \le f_2(x) \le \cdots \le f_n(x) \le \cdots,$$

and hence that

$$f_1(x)F'(x) \le f_2(x)F'(x) \le \cdots \le f_n(x)F'(x) \le \cdots.$$

Therefore, applying Levi’s theorem (Theorem 2, p. 305) to both sequences
$\{f_n\}$ and $\{f_n F'\}$, we get

$$\int_a^b f(x)\,dF(x) = \lim_{n \to \infty} \int_a^b f_n(x)\,dF(x) = \lim_{n \to \infty} \int_a^b f_n(x)\,F'(x)\,dx = \int_a^b f(x)\,F'(x)\,dx.$$


Example 3. Suppose

$$F(x) = D(x) + A(x),$$

where D is the jump function

$$D(x) = \sum_{x_n < x} h_n,$$

and A is absolutely continuous. Then it follows from Examples 1 and 2 that

$$\int_a^b f(x)\,dF(x) = \sum_n f(x_n)\,h_n + \int_a^b f(x)\,A'(x)\,dx.$$

In the case where F also contains a singular component, as in (5), there is no
such representation of the Lebesgue-Stieltjes integral (6) as the sum of a series
and an ordinary Lebesgue integral.
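
The formula of Example 3 can be compared with a direct approximating sum of the kind used in Sec. 36.4 for continuous integrands. In the sketch below, the jump points 0.25 and 0.75, the jump heights, the choices A(x) = x and f(x) = x² on [0, 1], and all function names are hypothetical values picked for the illustration.

```python
def stieltjes_sum(f, F, a, b, n):
    """Approximating sum of f with respect to F on a uniform partition of [a, b]."""
    total = 0.0
    for k in range(1, n + 1):
        x_prev = a + (b - a) * (k - 1) / n
        x_next = a + (b - a) * k / n
        xi = 0.5 * (x_prev + x_next)  # intermediate point of the subinterval
        total += f(xi) * (F(x_next) - F(x_prev))
    return total


# Hypothetical jump points x_n and jump heights h_n.
jumps = {0.25: 0.5, 0.75: 1.0}


def F(x):
    """Generating function D(x) + A(x) with A(x) = x, continuous from the left."""
    return x + sum(h for x_n, h in jumps.items() if x_n < x)


def f(x):
    return x * x


series_part = sum(f(x_n) * h for x_n, h in jumps.items())
lebesgue_part = 1.0 / 3.0  # integral of x**2 * A'(x) over [0, 1], since A'(x) = 1
closed_form = series_part + lebesgue_part
approx = stieltjes_sum(f, F, 0.0, 1.0, 20_000)

# The two numbers should agree up to the discretization error of the sum.
print(closed_form, approx)
```

The strict inequality `x_n < x` in `F` mirrors the summation condition in the definition of the jump function D, which is what makes F continuous from the left.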


Remark. We can easily extend the notion of a Lebesgue-Stieltjes integral
with respect to a nondecreasing function F to that of a Lebesgue-Stieltjes
integral with respect to an arbitrary function of bounded variation Φ. In
fact, as in Theorem 4, p. 331, let

$$\Phi = v - g,$$

where v, the total variation of Φ on the interval [a, x], and g = v − Φ are
both nondecreasing. We then set

$$\int_a^b f(x)\,d\Phi(x) = \int_a^b f(x)\,dv(x) - \int_a^b f(x)\,dg(x) \tag{8}$$

by definition (see Problem 2).


36.3. Applications to probability theory. The Lebesgue-Stieltjes integral
is widely used in mathematical analysis and its applications. The concept
plays a particularly important role in probability theory. Given a random
variable ξ,⁹ let

$$F(x) = P\{\xi < x\},$$

i.e., let F(x) be the probability that ξ takes a value less than x. Then F is
clearly nondecreasing and continuous from the left. Moreover, F satisfies
the conditions

$$F(-\infty) = 0, \qquad F(\infty) = 1$$

(why?). Conversely, every such function F can be represented as the
probability distribution of some random variable ξ.

Two basic numerical characteristics of a random variable ξ are its
mathematical expectation or mean (value)

$$\mathbf{E}\xi = \int_{-\infty}^{\infty} x\,dF(x), \tag{9}$$

and variance

$$\mathbf{D}\xi = \int_{-\infty}^{\infty} (x - \mathbf{E}\xi)^2\,dF(x) \tag{10}$$

(however, see Problem 5).


Example 1. A random variable ξ is said to be discrete if it can take no
more than countably many values $x_1, x_2, \ldots, x_n, \ldots$. For example, the
number of calls received on a given telephone line during a given time
interval is a discrete random variable. Let

$$p_n = P\{\xi = x_n\} \quad (n = 1, 2, \ldots)$$

be the probability of the random variable ξ taking the value $x_n$. Then the
distribution function of ξ is just the jump function

$$F(x) = \sum_{x_n < x} p_n.$$

In this case, the integrals (9) and (10) for the mean and variance of ξ reduce
to the sums

$$\mathbf{E}\xi = \sum_n x_n p_n,$$
$$\mathbf{D}\xi = \sum_n (x_n - a)^2 p_n \quad (a = \mathbf{E}\xi).$$
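
For a concrete instance of these sums, the sketch below evaluates the mean and variance of a fair six-sided die in exact arithmetic. The die and the helper name `mean_and_variance` are assumptions chosen for the illustration.

```python
from fractions import Fraction


def mean_and_variance(distribution):
    """Mean and variance of a discrete random variable given as {x_n: p_n}."""
    assert sum(distribution.values()) == 1, "probabilities must sum to 1"
    mean = sum(x * p for x, p in distribution.items())
    variance = sum((x - mean) ** 2 * p for x, p in distribution.items())
    return mean, variance


# A fair die: values 1, ..., 6, each taken with probability 1/6.
die = {face: Fraction(1, 6) for face in range(1, 7)}
mean, variance = mean_and_variance(die)

print(mean, variance)  # 7/2 35/12
```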
Example 2. A random variable ξ is said to be continuous if its
distribution function F is absolutely continuous. The derivative

$$p(x) = F'(x)$$

of the distribution function is then called the probability density of ξ. It
follows from Example 2, p. 364 that in this case the integrals (9) and (10)
for the mean and variance of ξ reduce to the following integrals with respect
to ordinary Lebesgue measure on the line:

$$\mathbf{E}\xi = \int_{-\infty}^{\infty} x\,p(x)\,dx,$$
$$\mathbf{D}\xi = \int_{-\infty}^{\infty} (x - a)^2\,p(x)\,dx \quad (a = \mathbf{E}\xi).$$

⁹ We presuppose familiarity with the rudiments of probability theory. See e.g., Y. A.
Rozanov, Introductory Probability Theory (translated by R. A. Silverman), Prentice-Hall,
Inc., Englewood Cliffs, N.J. (1969).

36.4. The Riemann-Stieltjes integral. Besides the Lebesgue-Stieltjes
integral introduced in Sec. 36.2 (which is in effect nothing but the difference
between two ordinary Lebesgue integrals with respect to two measures on the
real line¹⁰), we can also introduce the Riemann-Stieltjes integral, defined
as a limit of certain approximating sums, analogous to those used to define
the ordinary Riemann integral. To this end, let f and Φ be two functions on
[a, b], where Φ is of bounded variation and continuous from the left, and let

$$a = x_0 < x_1 < x_2 < \cdots < x_n = b$$

be a partition of the interval [a, b] by points of subdivision $x_0, x_1, x_2, \ldots, x_n$.
Choosing an arbitrary point $\xi_k$ in each subinterval $[x_{k-1}, x_k]$, we form
the sum

$$\sum_{k=1}^{n} f(\xi_k)\,[\Phi(x_k) - \Phi(x_{k-1})]. \tag{11}$$

Suppose that as the partition is “refined,” i.e., as the quantity

$$\max\{x_1 - x_0,\ x_2 - x_1,\ \ldots,\ x_n - x_{n-1}\} \tag{12}$$

(equal to the maximum length of the subintervals) approaches zero, the sum
(11) approaches a limit independent of the choice of both the points of
subdivision $x_k$ and the “intermediate points” $\xi_k$. Then this limit is called
the Riemann-Stieltjes integral of f with respect to Φ, and is denoted by

$$\int_a^b f(x)\,d\Phi(x)$$

(just as in the case of the Lebesgue-Stieltjes integral).
Remark. If $\Phi = \Phi_1 + \Phi_2$, then

$$\int_a^b f(x)\,d\Phi(x) = \int_a^b f(x)\,d\Phi_1(x) + \int_a^b f(x)\,d\Phi_2(x) \tag{13}$$

(provided the integrals on the right exist). In fact, we need only write the
identity

$$\sum_{k=1}^{n} f(\xi_k)[\Phi(x_k) - \Phi(x_{k-1})] = \sum_{k=1}^{n} f(\xi_k)[\Phi_1(x_k) - \Phi_1(x_{k-1})] + \sum_{k=1}^{n} f(\xi_k)[\Phi_2(x_k) - \Phi_2(x_{k-1})],$$

and then pass to the limit as the quantity (12) approaches zero.

¹⁰ Recall formula (8).


THEOREM 1. If f is continuous on [a, b], then its Riemann-Stieltjes
integral exists and coincides with its Lebesgue-Stieltjes integral.


Proof. The sum (11) can be regarded as the Lebesgue-Stieltjes integral
of the step function

$$f_n(x) = f(\xi_k) \quad \text{if } x_{k-1} \le x < x_k \quad (k = 1, \ldots, n).$$

As the partition of [a, b] is refined, the sequence $\{f_n\}$ converges uniformly
to f (why?). Hence, by the very definition of the Lebesgue integral
(recall p. 294),

$$\lim_{n \to \infty} \int_a^b f_n(x)\,d\Phi(x) = I,$$

where I is the Lebesgue-Stieltjes integral of f over [a, b). But then

$$\lim_{n \to \infty} \sum_{k=1}^{n} f(\xi_k)[\Phi(x_k) - \Phi(x_{k-1})] = I,$$

where the limit on the left is the Riemann-Stieltjes integral of f over
[a, b]. ∎
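THEOREM 2. If f is continuous on [a, b], then

$$\left| \int_a^b f(x)\,d\Phi(x) \right| \le V_a^b(\Phi) \max_{a \le x \le b} |f(x)|, \tag{14}$$

where $V_a^b(\Phi)$ is the total variation of Φ on [a, b].


Proof. The inequality

$$\begin{aligned}
\left| \sum_{k=1}^{n} f(\xi_k)[\Phi(x_k) - \Phi(x_{k-1})] \right| &\le \sum_{k=1}^{n} |f(\xi_k)|\,|\Phi(x_k) - \Phi(x_{k-1})| \\
&\le \max_{a \le x \le b} |f(x)| \sum_{k=1}^{n} |\Phi(x_k) - \Phi(x_{k-1})| \le V_a^b(\Phi) \max_{a \le x \le b} |f(x)|
\end{aligned}$$

holds for any partition of the interval [a, b]. Taking the limit of the
left-hand side as $\max\{x_1 - x_0, \ldots, x_n - x_{n-1}\} \to 0$, we get (14). ∎

Remark. If Φ(x) = x, (14) reduces to the familiar estimate

$$\left| \int_a^b f(x)\,dx \right| \le (b - a) \max_{a \le x \le b} |f(x)|$$

for the ordinary Riemann integral.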
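
The estimate (14) can be observed on a concrete pair of functions. In the sketch below, the choices f(x) = cos x and Φ(x) = sin x on [0, 2π], the uniform partition, and the helper names are assumptions made for the illustration. The variation is computed over the partition points only, which gives a lower bound for the total variation and, for this particular Φ, essentially its exact value 4.

```python
import math


def partition(a, b, n):
    """Uniform partition of [a, b] into n subintervals."""
    return [a + (b - a) * k / n for k in range(n + 1)]


def stieltjes_sum(f, Phi, points):
    """Sum (11), with each intermediate point taken at the left end of its subinterval."""
    return sum(f(x0) * (Phi(x1) - Phi(x0)) for x0, x1 in zip(points, points[1:]))


def variation_on(Phi, points):
    """Sum of |Phi(x_k) - Phi(x_{k-1})| over the partition."""
    return sum(abs(Phi(x1) - Phi(x0)) for x0, x1 in zip(points, points[1:]))


points = partition(0.0, 2.0 * math.pi, 4000)
lhs = abs(stieltjes_sum(math.cos, math.sin, points))
rhs = variation_on(math.sin, points) * max(abs(math.cos(x)) for x in points)

# lhs approximates the integral of cos(x)**2 over [0, 2*pi]; rhs is the bound from (14).
print(lhs, rhs)
```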




THEOREM 3. Let Φ be a function of bounded variation on [a, b], different
from zero at no more than countably many points $c_1, c_2, \ldots, c_n, \ldots$ in
(a, b). Then

$$\int_a^b f(x)\,d\Phi(x) = 0 \tag{15}$$

for any function f continuous on [a, b].


Proof. The assertion is obvious if Φ is nonzero at only a single point
$c_1 \in (a, b)$, since then

$$\sum_{k=1}^{n} f(\xi_k)[\Phi(x_k) - \Phi(x_{k-1})] = 0$$

for an “arbitrarily fine” partition

$$a = x_0 < x_1 < x_2 < \cdots < x_n = b,$$

i.e., a partition for which the quantity (12) is arbitrarily small, provided
we make sure that $c_1$ is not one of the points of subdivision $x_0, x_1, \ldots, x_n$.¹¹
Hence, by (13), the assertion is also true if Φ is nonzero at only
finitely many points in (a, b). Now suppose Φ is different from zero at
countably many points

$$c_1, c_2, \ldots, c_n, \ldots$$

in (a, b), and let

$$y_n = \Phi(c_n).$$

Then

$$\sum_{n=1}^{\infty} |y_n| < \infty,$$

since Φ is of bounded variation. Given any ε > 0, we choose N such that

$$\sum_{n=N+1}^{\infty} |y_n| < \varepsilon,$$

and write Φ in the form

$$\Phi = \Phi_N + \Phi^*, \tag{16}$$

where $\Phi_N$ takes the values $y_1, \ldots, y_N$ at the points $c_1, \ldots, c_N$ and is
zero elsewhere, while $\Phi^*$ takes the values $y_{N+1}, y_{N+2}, \ldots$ at the points
$c_{N+1}, c_{N+2}, \ldots$ and is zero elsewhere. Then, as just shown,

$$\int_a^b f(x)\,d\Phi_N(x) = 0. \tag{17}$$

Moreover

$$\left| \sum_{k=1}^{m} f(\xi_k)[\Phi^*(x_k) - \Phi^*(x_{k-1})] \right| \le 2M \sum_{n=N+1}^{\infty} |y_n| < 2M\varepsilon,$$

where

$$M = \max_{a \le x \le b} |f(x)|,$$

or

$$\left| \int_a^b f(x)\,d\Phi^*(x) \right| \le 2M\varepsilon$$

after taking the limit as m → ∞. This in turn implies

$$\int_a^b f(x)\,d\Phi^*(x) = 0, \tag{18}$$

since ε > 0 is arbitrary. Formula (15) now follows at once from (13)
and (16)-(18). ∎

¹¹ Note that here we rely on the fact that $c_1$ is not an end point of [a, b].


36.5. Helly’s theorems. In Sec. 30.1 we found conditions insuring the
validity of passing to the limit in Lebesgue integrals, i.e., conditions under
which

$$\lim_{n \to \infty} \int f_n\,d\mu = \int f\,d\mu, \tag{19}$$

where $\{f_n\}$ is a sequence of functions converging (almost everywhere) to a
function f and the integrals are all with respect to a fixed measure μ. In
the case of Stieltjes integrals, we now ask a closely related but somewhat
different question: Under what conditions does the formula

$$\lim_{n \to \infty} \int_a^b f(x)\,d\Phi_n(x) = \int_a^b f(x)\,d\Phi(x) \tag{20}$$

hold, where f is continuous and $\{\Phi_n\}$ is a sequence of functions of bounded
variation converging (everywhere) to a function Φ? (Note that here, unlike
(19), the function f is fixed, and it is the function $\Phi_n$, or the corresponding
Stieltjes measure, which varies.) The answer to this question is given by

THEOREM 4 (Helly’s convergence theorem). Let $\{\Phi_n\}$ be a sequence of
functions of bounded variation on [a, b], converging to a function Φ at every
point of [a, b]. Suppose the sequence of total variations $\{V_a^b(\Phi_n)\}$ is
bounded, so that

$$V_a^b(\Phi_n) \le C \quad (n = 1, 2, \ldots) \tag{21}$$

for some constant C > 0. Then Φ is also of bounded variation on [a, b],
and (20) holds for every function f continuous on [a, b].

Proof. Let

$$a = x_0 < x_1 < x_2 < \cdots < x_m = b$$

be any partition of the interval [a, b] by points of subdivision $x_0, x_1, \ldots, x_m$.
Then

$$\sum_{k=1}^{m} |\Phi(x_k) - \Phi(x_{k-1})| = \lim_{n \to \infty} \sum_{k=1}^{m} |\Phi_n(x_k) - \Phi_n(x_{k-1})| \le C,$$

and hence

$$V_a^b(\Phi) \le C, \tag{22}$$

i.e., Φ is of bounded variation on [a, b], as asserted.
Next we show that (20) holds if f is a step function. Suppose

$$f(x) = h_k \quad \text{if } x_{k-1} \le x < x_k.$$

Then

$$\int_a^b f(x)\,d\Phi_n(x) = \sum_k h_k[\Phi_n(x_k) - \Phi_n(x_{k-1})] \tag{23}$$

and¹²

$$\int_a^b f(x)\,d\Phi(x) = \sum_k h_k[\Phi(x_k) - \Phi(x_{k-1})], \tag{24}$$

where obviously (23) approaches (24) as n → ∞. Now let f be continuous
on [a, b]. Given any ε > 0, choose a step function $f_\varepsilon$ such that

$$|f(x) - f_\varepsilon(x)| < \frac{\varepsilon}{3C} \quad (a \le x \le b) \tag{25}$$

(why is this possible?). Then

$$\left| \int_a^b f(x)\,d\Phi(x) - \int_a^b f(x)\,d\Phi_n(x) \right| \le |I_1| + |I_2| + |I_3|, \tag{26}$$

where

$$I_1 = \int_a^b f(x)\,d\Phi(x) - \int_a^b f_\varepsilon(x)\,d\Phi(x),$$
$$I_2 = \int_a^b f_\varepsilon(x)\,d\Phi(x) - \int_a^b f_\varepsilon(x)\,d\Phi_n(x),$$
$$I_3 = \int_a^b f_\varepsilon(x)\,d\Phi_n(x) - \int_a^b f(x)\,d\Phi_n(x).$$

By the inequality (14), which clearly holds for Lebesgue-Stieltjes integrals
as well as for Riemann-Stieltjes integrals (why?), we have

$$\begin{aligned}
|I_1| &= \left| \int_a^b [f(x) - f_\varepsilon(x)]\,d\Phi(x) \right| \le \frac{\varepsilon}{3C}\,V_a^b(\Phi) \le \frac{\varepsilon}{3}, \\
|I_3| &= \left| \int_a^b [f_\varepsilon(x) - f(x)]\,d\Phi_n(x) \right| \le \frac{\varepsilon}{3C}\,V_a^b(\Phi_n) \le \frac{\varepsilon}{3},
\end{aligned} \tag{27}$$

after using (21), (22) and (25). Moreover, as just shown,

$$|I_2| < \frac{\varepsilon}{3} \tag{28}$$

for sufficiently large n. It follows from (26)-(28) that

$$\left| \int_a^b f(x)\,d\Phi(x) - \int_a^b f(x)\,d\Phi_n(x) \right| < \varepsilon,$$

which implies (20), since ε > 0 is arbitrary. ∎

¹² Think of (23) and (24) as Lebesgue-Stieltjes integrals.


Theorem 4 gives conditions under which we can take the limit of a
sequence $\{\Phi_n\}$ of functions of bounded variation inside a Stieltjes integral.
The next theorem gives conditions guaranteeing the existence of a sequence
$\{\Phi_n\}$ meeting the requirements of Theorem 4.


THEOREM 5 (Helly’s selection principle). Let Φ be a family of functions φ
defined on an interval [a, b] and satisfying the conditions

$$V_a^b(\varphi) \le C, \qquad \sup_{a \le x \le b} |\varphi(x)| \le M \tag{29}$$

for suitable C and M. Then Φ contains a sequence which converges for
every $x \in [a, b]$.


Proof. It is enough to prove the theorem for nondecreasing functions.
In fact, let

$$\varphi = v - g,$$

where v is the total variation of φ on [a, x]. Then the functions v
corresponding to all $\varphi \in \Phi$ are nondecreasing and satisfy the conditions of
the theorem, since

$$V_a^b(v) = V_a^b(\varphi) \le C, \qquad \sup_{a \le x \le b} |v(x)| \le C.$$

Assuming that the theorem holds for nondecreasing functions, we choose
a sequence $\{\varphi_n\}$ from Φ such that $v_n$ converges to a limit $v^*$ on [a, b].
Then the functions

$$g_n = v_n - \varphi_n$$

are also nondecreasing and satisfy the conditions of the theorem (why?).
Therefore $\{\varphi_n\}$ contains a subsequence $\{\varphi_{n_k}\}$ such that $\{g_{n_k}\}$ converges
to a limit $g^*$ on [a, b]. But then

$$\lim_{k \to \infty} \varphi_{n_k}(x) = \varphi^*(x),$$

where

$$\varphi^*(x) = v^*(x) - g^*(x).$$

Thus we now proceed to prove the theorem for nondecreasing
functions. Let $r_1, r_2, \ldots, r_n, \ldots$ be the rational points of [a, b]. It
follows from (29) that the set of numbers

$$\varphi(r_1) \quad (\varphi \in \Phi)$$

is bounded. Hence there is a sequence of functions $\{\varphi_n^{(1)}\}$ converging at
the point $r_1$. Similarly, $\{\varphi_n^{(1)}\}$ contains a subsequence $\{\varphi_n^{(2)}\}$ converging
at the point $r_2$ as well as at $r_1$, $\{\varphi_n^{(2)}\}$ contains a subsequence $\{\varphi_n^{(3)}\}$
converging at the point $r_3$ as well as at $r_1$ and $r_2$, and so on. The “diagonal
sequence”

$$\{\psi_n\} = \{\varphi_n^{(n)}\}$$

will then converge at every rational point of [a, b]. The limit of this
sequence is a nondecreasing function ψ, defined only at the points
$r_1, r_2, \ldots, r_n, \ldots$. We complete the definition of ψ at the remaining
points of [a, b] by setting

$$\psi(x) = \lim_{\substack{r \to x - 0 \\ r \text{ rational}}} \psi(r) \quad \text{if } x \text{ is irrational.}$$

The resulting function ψ is then the limit of $\{\psi_n\}$ at every continuity
point of ψ. In fact, let $x^*$ be such a point. Then, given any ε > 0, there
is a δ > 0 such that

$$|\psi(x^*) - \psi(x)| < \frac{\varepsilon}{6} \tag{30}$$

if

$$|x^* - x| < \delta.$$

Let r′ and r″ be rational numbers such that

$$x^* - \delta < r' < x^* < r'' < x^* + \delta,$$

and let n be so large that

$$|\psi_n(r') - \psi(r')| < \frac{\varepsilon}{6}, \qquad |\psi_n(r'') - \psi(r'')| < \frac{\varepsilon}{6}. \tag{31}$$

It follows from (30) and (31) that

$$|\psi_n(r') - \psi_n(r'')| < \frac{2\varepsilon}{3}.$$

Since $\psi_n$ is a nondecreasing function, we have

$$\psi_n(r') \le \psi_n(x^*) \le \psi_n(r''),$$

and hence

$$|\psi(x^*) - \psi_n(x^*)| \le |\psi(x^*) - \psi(r')| + |\psi(r') - \psi_n(r')| + |\psi_n(r') - \psi_n(x^*)| < \frac{\varepsilon}{6} + \frac{\varepsilon}{6} + \frac{2\varepsilon}{3} = \varepsilon.$$

Therefore

$$\lim_{n \to \infty} \psi_n(x^*) = \psi(x^*),$$

since ε > 0 is arbitrary.
Thus we have constructed a sequence $\{\psi_n\}$ of functions in Φ
converging to a limit function ψ everywhere except possibly at discontinuity
points of ψ. Since there are no more than countably many such points
(why?), we can again use the “diagonal process” to find a subsequence
of $\{\psi_n\}$ which converges at these points as well, and hence converges
everywhere on [a, b]. ∎


36.6. The Riesz representation theorem. Next we show how Stieltjes
integrals can be used to represent the general linear functional on the space
$C_{[a,b]}$ of all functions continuous on the interval [a, b]:


THEOREM 6 (F. Riesz). Every continuous linear functional φ on the
space $C_{[a,b]}$ can be represented in the form

$$\varphi(f) = \int_a^b f(x)\,d\Phi(x), \tag{32}$$

where Φ is a function of bounded variation on [a, b], and moreover

$$\|\varphi\| = V_a^b(\Phi). \tag{33}$$

Proof. The space C,,,,, can be regarded as a subspace of the space 
M,,.5; of all bounded functions on [a, b], with the same norm 


Ill = one (x)| 


as in C;, 5). Let @ be a continuous linear functional on C,, 4). By the 
Hahn-Banach theorem (Theorem 5, p. 180), ~ can be extended without 
changing its norm from C;,,,,; onto the whole space M,,,,;. In particular, 
this extended functional will be defined on all functions of the form 


| i 2s, 
T(x) = (a<t< b). (34) 
0 if x>T7 


Let 
P(t) = 9(f,). (35) 
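Then $\Phi$ is of bounded variation on [a, b]. In fact, given any partition

$$a = x_0 < x_1 < \cdots < x_n = b \tag{36}$$

of [a, b], let

$$\alpha_k = \operatorname{sgn}\,[\Phi(x_k) - \Phi(x_{k-1})] \qquad (k = 1, \ldots, n),$$

where

$$\operatorname{sgn} x = \begin{cases} 1 & \text{if } x > 0, \\ 0 & \text{if } x = 0, \\ -1 & \text{if } x < 0. \end{cases}$$

Then

$$\sum_{k=1}^{n} |\Phi(x_k) - \Phi(x_{k-1})| = \sum_{k=1}^{n} \alpha_k [\Phi(x_k) - \Phi(x_{k-1})] = \sum_{k=1}^{n} \alpha_k \varphi(f_{x_k} - f_{x_{k-1}}) = \varphi\Big(\sum_{k=1}^{n} \alpha_k (f_{x_k} - f_{x_{k-1}})\Big) \le \|\varphi\| \, \Big\|\sum_{k=1}^{n} \alpha_k (f_{x_k} - f_{x_{k-1}})\Big\|.$$

But the function

$$\sum_{k=1}^{n} \alpha_k (f_{x_k} - f_{x_{k-1}})$$

can only take the values 0, ±1, and hence its norm does not exceed 1. Therefore

$$\sum_{k=1}^{n} |\Phi(x_k) - \Phi(x_{k-1})| \le \|\varphi\|.$$

Since this is true for any partition of [a, b], we have

$$V_a^b(\Phi) \le \|\varphi\|, \tag{37}$$

i.e., $\Phi$ is of bounded variation on [a, b], as asserted.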

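We now show that the functional $\varphi$ can be represented in the form of a
Stieltjes integral with respect to the function $\Phi$ just constructed. Let f
be any function continuous on [a, b]. Given any $\varepsilon > 0$, let
$\delta > 0$ be such that $|x' - x''| < \delta$ implies
$|f(x') - f(x'')| < \varepsilon$. Suppose the partition (36) is such that each
subinterval $[x_{k-1}, x_k]$ is of length less than $\delta$, and consider the
step function

$$f^{(\varepsilon)}(x) = f(x_k) \quad \text{if} \quad x_{k-1} \le x < x_k \qquad (k = 1, \ldots, n),$$

which can obviously be written in the form

$$f^{(\varepsilon)}(x) = \sum_{k=1}^{n} f(x_k)\,[f_{x_k}(x) - f_{x_{k-1}}(x)], \tag{38}$$

where $f_\tau$ is the function defined by (34). Clearly,

$$|f(x) - f^{(\varepsilon)}(x)| < \varepsilon$$

for all $x \in [a, b]$,¹³ i.e.,

$$\|f - f^{(\varepsilon)}\| < \varepsilon. \tag{39}$$

It follows from (35) and (38) that

$$\varphi(f^{(\varepsilon)}) = \sum_{k=1}^{n} f(x_k)\,[\varphi(f_{x_k}) - \varphi(f_{x_{k-1}})] = \sum_{k=1}^{n} f(x_k)\,[\Phi(x_k) - \Phi(x_{k-1})],$$


13 We complete the definition of $f^{(\varepsilon)}$ by setting $f^{(\varepsilon)}(b) = f(x_n) = f(b)$ for every $\varepsilon > 0$.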
i.e., $\varphi(f^{(\varepsilon)})$ is an "approximating sum" of the
Riemann-Stieltjes integral

$$\int_a^b f(x)\,d\Phi(x).$$

Therefore

$$\Big|\varphi(f^{(\varepsilon)}) - \int_a^b f(x)\,d\Phi(x)\Big| < \varepsilon$$

for a "sufficiently fine" partition of the interval [a, b]. On the other
hand,

$$|\varphi(f) - \varphi(f^{(\varepsilon)})| \le \|\varphi\|\,\|f - f^{(\varepsilon)}\| \le \|\varphi\|\,\varepsilon$$

because of (39). But then

$$\Big|\varphi(f) - \int_a^b f(x)\,d\Phi(x)\Big| < (\|\varphi\| + 1)\,\varepsilon,$$

which implies (32), since $\varepsilon > 0$ is arbitrary. To prove (33), we
merely combine (37) with the opposite inequality

$$\|\varphi\| \le V_a^b(\Phi),$$

which is an immediate consequence of Theorem 2 and the representation
(32). ∎
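The approximating sums that appear in the proof can be tried out numerically. The following is only a sketch under stated assumptions: the helper name `stieltjes_sum`, the uniform partition, and the sample choice f(x) = x, Φ(x) = x² on [0, 1] are illustrative and do not come from the text.

```python
def stieltjes_sum(f, Phi, a, b, n):
    """Approximating sum  sum_k f(x_k) * [Phi(x_k) - Phi(x_{k-1})]
    over the uniform partition a = x_0 < x_1 < ... < x_n = b."""
    xs = [a + (b - a) * k / n for k in range(n + 1)]
    return sum(f(xs[k]) * (Phi(xs[k]) - Phi(xs[k - 1])) for k in range(1, n + 1))


# Illustrative choice (an assumption, not from the text):
# f(x) = x and Phi(x) = x**2 on [0, 1]. Since Phi is smooth, the
# Stieltjes integral equals the ordinary integral of x * 2x, i.e. 2/3.
approx = stieltjes_sum(lambda x: x, lambda x: x * x, 0.0, 1.0, 10_000)
print(approx)
```

Refining the partition (larger `n`) brings the sum closer to 2/3, which mirrors the role of the "sufficiently fine" partition in the proof.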


Problem 1. Let $\mu$ be an arbitrary finite σ-additive measure on the real
line $(-\infty, \infty)$. Represent $\mu$ as the Stieltjes measure corresponding
to some generating function F.


Hint. Let $F(x) = \mu(-\infty, x)$.


Comment. Thus the term "Stieltjes measure" does not refer to a special
kind of measure, but rather to a special way of constructing a measure (by
using a generating function).


Problem 2. Let $\Phi$ be a function of bounded variation with two distinct
representations $\Phi = v - g$, $\Phi = v^* - g^*$ in terms of nondecreasing
functions $v$, $g$, $v^*$ and $g^*$ (give an example). Prove that

$$\int_a^b f(x)\,dv(x) - \int_a^b f(x)\,dg(x) = \int_a^b f(x)\,dv^*(x) - \int_a^b f(x)\,dg^*(x).$$

Comment. Thus in the definition (8) of the Lebesgue-Stieltjes integral
with respect to a function of bounded variation $\Phi$, the particular
representation of $\Phi$ as a difference between two nondecreasing functions
does not matter, i.e., $v$ need not be the total variation of $\Phi$ on [a, x].
Problem 3. Let $\xi$ be the number of spots obtained in throwing an unbiased
die. Find the mean and variance of $\xi$.


Ans. $E\xi = 7/2$, $D\xi = 35/12$.
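The mean and variance of the die can be checked with exact rational arithmetic. This is a small sketch; the variable names are illustrative, and the only assumption is that each of the six faces has probability 1/6.

```python
from fractions import Fraction

faces = range(1, 7)
p = Fraction(1, 6)  # an unbiased die: every face equally likely

mean = sum(p * k for k in faces)                     # first moment
variance = sum(p * (k - mean) ** 2 for k in faces)   # second central moment

print(mean, variance)
```

Using `Fraction` avoids rounding, so the results can be compared with the fractions 7/2 and 35/12 directly.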

Problem 4. Find the mean and variance of the random variable $\xi$ with
probability density

$$p(x) = \tfrac{1}{2}\,e^{-|x|} \qquad (-\infty < x < \infty).$$

Problem 5. Let $\xi$ be the random variable with probability density

$$p(x) = \frac{1}{\pi(1 + x^2)} \qquad (-\infty < x < \infty).$$

Prove that $E\xi$ and $D\xi$ fail to exist.


Problem 6. Discuss random variables which are neither discrete nor
continuous.


Problem 7. Given a random variable $\xi$ with distribution function F,
consider the new random variable $\eta = \varphi(\xi)$, where $\varphi$ is a
function summable with respect to the Stieltjes measure $\mu_F$ generated by F.
Express $E\eta$ and $D\eta$ in terms of F.


Hint. Consider the problem of changing variables in a Lebesgue integral.

Ans. For example, $E\eta = \int_{-\infty}^{\infty} \varphi(x)\,dF(x)$.


Problem 8. Prove that if f is continuous on [a, b], then the
Riemann-Stieltjes integral

$$\int_a^b f(x)\,d\Phi(x) \tag{40}$$

does not depend on the values taken by $\Phi$ at its discontinuity points in (a, b).


Hint. Use Theorem 3 and formula (13).


Comment. Hence if f is continuous, we need not insist that $\Phi$ be
continuous from the left at its discontinuity points in (a, b). In fact,
$\Phi$ can be assigned arbitrary values at these points.


Problem 9. Write formulas for the Riemann-Stieltjes integral (40) in the
case where f is continuous and


a) $\Phi$ is a jump function;
b) $\Phi$ is an absolutely continuous function with a Riemann-integrable
derivative.
Problem 10. Evaluate the following Riemann-Stieltjes integrals:


0 if ~=—1, 


a) Px ares, where F(x) = 1 if —l<x <2, 
—1 th 2 ee SE 
== if O0< x <i, 

Oo if fax <j, 


b) i. x” dF(x), where F(x) = 


if x = 3, 
=) if $<x<2; 
x if O<x< i, 
c) ih x” dF(x), where F(x) = 1. — 
x = 
f 


Problem 11. Develop a theory o 
whole real line (— 0, 00). 

Problem 12. Extend Theorem 4 to the case where $a = -\infty$ or $b = \infty$
(or both), assuming that f(x) approaches a limit as $x \to \pm\infty$.

Problem 13. Let $\{\Phi_n\}$ be the same as in Theorem 4, and let $\{f_n\}$ be a
sequence of continuous functions on [a, b] converging uniformly to a limit f.
Prove that

$$\lim_{n \to \infty} \int_a^b f_n(x)\,d\Phi_n(x) = \int_a^b f(x)\,d\Phi(x).$$


Problem 14. Prove that there is a one-to-one correspondence between
the set of all continuous linear functionals $\varphi$ on $C_{[a,b]}$ and the
space $V^0_{[a,b]}$ of Problem 8, p. 332, provided we identify any two elements
of $V^0_{[a,b]}$ which coincide at all their continuity points. Prove that the
inequality

$$V_a^b(\Phi) \le \|\varphi\|$$

need not hold for every $\Phi \in V^0_{[a,b]}$ corresponding to a given
functional $\varphi$ on $C_{[a,b]}$, but that there is always at least one such
element $\Phi$ for which the inequality holds.


37. The Spaces $L_1$ and $L_2$


37.1. Definition and basic properties of $L_1$. Let X be a space equipped
with a measure $\mu$, where the measure of X itself may be either finite or
infinite. Then by $L_1(X, \mu)$, or simply $L_1$, we mean the set of all real
functions f summable on X (however, see Problem 1). Clearly $L_1$ is a linear
space (with addition of functions and multiplication of functions by numbers
defined in the usual way), since a linear combination of summable functions
is again a summable function. To introduce a norm in $L_1$, we define

$$\|f\| = \int |f(x)|\,d\mu, \tag{1}$$

where, as in the rest of this section, the symbol $\int$ by itself denotes
integration over the whole space X. Of the various properties of a norm (see
p. 138), it follows at once from (1) that

$$\|f\| \ge 0, \qquad \|\alpha f\| = |\alpha|\,\|f\|, \qquad \|f_1 + f_2\| \le \|f_1\| + \|f_2\|,$$

and we need only verify that $\|f\| = 0$ if and only if $f = 0$. To insure this,
we agree to regard equivalent functions (i.e., functions differing only on
a set of measure zero) as identical elements of the space $L_1$. Thus the
elements of $L_1$ are, to be perfectly exact, classes of equivalent summable
functions.¹⁴ In particular, the zero element of $L_1$ is the class consisting
of all functions vanishing almost everywhere. With this understanding, we will
continue to talk (more casually) about "functions in $L_1$."

In $L_1$, as in any normed linear space, we can use the formula

$$\rho(f, g) = \|f - g\|$$

to define a distance. Let $\{f_n\}$ be a sequence of functions in $L_1$. Then
$\{f_n\}$ is said to converge in the mean to a function $f \in L_1$ if
$\rho(f_n, f) \to 0$ as $n \to \infty$.


THEOREM 1. The space $L_1$ is complete.

Proof. Let $\{f_n\}$ be a Cauchy sequence in $L_1$, so that

$$\|f_m - f_n\| \to 0 \quad \text{as} \quad m, n \to \infty.$$

Then we can find a sequence of indices $\{n_k\}$ (where $n_1 < n_2 < \cdots <
n_k < \cdots$) such that

$$\|f_{n_k} - f_{n_{k+1}}\| = \int |f_{n_k}(x) - f_{n_{k+1}}(x)|\,d\mu < \frac{1}{2^k} \qquad (k = 1, 2, \ldots).$$

It follows from the corollary to Levi's theorem (see p. 307) that the series

$$|f_{n_1}| + |f_{n_2} - f_{n_1}| + \cdots$$

converges almost everywhere on X. Therefore the series

$$f_{n_1} + (f_{n_2} - f_{n_1}) + \cdots$$

also converges almost everywhere on X to some function

$$f(x) = \lim_{k \to \infty} f_{n_k}(x).$$

But $\{f_{n_k}\}$ converges in the mean to the same function f. In fact, given
any $\varepsilon > 0$,

$$\int |f_{n_k}(x) - f_{n_l}(x)|\,d\mu < \varepsilon \tag{2}$$

for sufficiently large k and l, since $\{f_n\}$ is a Cauchy sequence. Hence,
by Fatou's theorem (Theorem 3, p. 307), we can take the limit as $l \to \infty$
behind the integral sign in (2), obtaining

$$\int |f_{n_k}(x) - f(x)|\,d\mu \le \varepsilon.$$

It follows that $f \in L_1$ (why?) and that $f_{n_k} \to f$ in the mean. But if
a Cauchy sequence contains a subsequence converging to a limit, then the
sequence itself must converge to the same limit. Hence $f_n \to f$ in the
mean. ∎


14 Thus the precise definition of addition of two elements $\varphi_1, \varphi_2 \in L_1$ is the following:
Let $f_1$ and $f_2$ be "representatives" of $\varphi_1$ and $\varphi_2$, respectively, i.e., let
$f_1 \in \varphi_1$, $f_2 \in \varphi_2$. Then $\varphi_1 + \varphi_2$ is the class containing $f_1 + f_2$
(this class clearly does not depend on the particular choice of $f_1$ and $f_2$).


According to the definition of the Lebesgue integral (see p. 296), given
any function f summable on X and any $\varepsilon > 0$, there is a summable
simple function $\varphi(x)$ such that

$$\int |f(x) - \varphi(x)|\,d\mu < \varepsilon.$$

Moreover, the Lebesgue integral of a summable simple function $\varphi$ taking
values $y_1, y_2, \ldots$ on sets $E_1, E_2, \ldots$ is defined as the sum of
the series

$$\sum_{n=1}^{\infty} y_n\,\mu(E_n)$$

(assumed to converge absolutely). Therefore every summable simple function
can be represented as the limit in the mean (i.e., as the limit in the sense of
convergence in the mean) of a sequence of summable simple functions,
each taking only finitely many values. In fact, given any $\varepsilon > 0$, let
N be such that

$$\sum_{n=N+1}^{\infty} |y_n|\,\mu(E_n) < \varepsilon,$$

and let¹⁵

$$\varphi_N(x) = \begin{cases} y_k & \text{if } x \in E_k,\ 1 \le k \le N, \\ 0 & \text{otherwise}. \end{cases}$$

Then

$$\int |\varphi(x) - \varphi_N(x)|\,d\mu \le \sum_{n=N+1}^{\infty} |y_n|\,\mu(E_n) < \varepsilon.$$

In other words, the set of all simple functions taking only finitely many values
is everywhere dense in the space $L_1$.


15 Note that $\varphi_N$ is a finite linear combination of characteristic functions, namely
$\varphi_N(x) = y_1 \chi_{E_1}(x) + \cdots + y_N \chi_{E_N}(x)$
(see footnote 11, p. 349).


THEOREM 2. Let X be a metric space equipped with a measure $\mu$ such
that¹⁶


1) Every open set and every closed set in X is measurable;
2) If a set $M \subset X$ is measurable, then

$$\mu(M) = \inf_{M \subset G} \mu(G), \tag{3}$$

where the greatest lower bound is taken over all open sets $G \subset X$
containing M.


Then the set of all continuous functions on X is everywhere dense in
$L_1(X, \mu)$.


Proof. We need only show that every simple function taking only
finitely many values is the limit in the mean of a sequence of continuous
functions. But every simple function taking only finitely many values is
a finite linear combination of characteristic functions of measurable sets,
and hence we need only show that every such characteristic function
$\chi_M(x)$ is the limit in the mean of a sequence of continuous functions.
If $M \subset X$ is measurable, then (3) implies that given any
$\varepsilon > 0$, there is a closed set $F_M$ and an open set $G_M$ such that

$$F_M \subset M \subset G_M, \qquad \mu(G_M) - \mu(F_M) < \varepsilon. \tag{4}$$

Now let¹⁷

$$\varphi_\varepsilon(x) = \frac{\rho(X - G_M, x)}{\rho(X - G_M, x) + \rho(F_M, x)}.$$

Then

$$\varphi_\varepsilon(x) = \begin{cases} 0 & \text{if } x \in X - G_M, \\ 1 & \text{if } x \in F_M. \end{cases}$$

Moreover, $\varphi_\varepsilon$ is continuous, since $\rho(F_M, x)$ and
$\rho(X - G_M, x)$ are both continuous functions, with a nonvanishing sum. But
$|\chi_M - \varphi_\varepsilon|$ does not exceed 1 on $G_M - F_M$ and vanishes
outside this set. Using (4), we find that

$$\int |\chi_M(x) - \varphi_\varepsilon(x)|\,d\mu < \varepsilon. \qquad ∎$$
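The construction in the proof can be made concrete on the real line. The sketch below is illustrative only: it assumes X is the real line with Lebesgue measure, M = F = [0, 1], and G = (-h, 1 + h), so that μ(G) − μ(F) = 2h; the function names and the midpoint-rule integration are choices made for this example, not part of the text.

```python
def dist_to_F(x):
    """rho(F, x) for the closed set F = [0, 1]."""
    return max(0.0, -x, x - 1.0)


def dist_to_complement_of_G(x, h):
    """rho(X - G, x) for the open set G = (-h, 1 + h) in the real line."""
    return max(0.0, min(x + h, 1.0 + h - x))


def phi(x, h):
    """Continuous function equal to 1 on F and 0 outside G."""
    d_out = dist_to_complement_of_G(x, h)
    d_in = dist_to_F(x)
    # The denominator never vanishes: F and X - G are disjoint closed sets.
    return d_out / (d_out + d_in)


def chi_M(x):
    """Characteristic function of M = [0, 1]."""
    return 1.0 if 0.0 <= x <= 1.0 else 0.0


def l1_distance(h, lo=-1.0, hi=2.0, n=30_000):
    """Midpoint-rule approximation of the integral of |chi_M - phi| over [lo, hi]."""
    step = (hi - lo) / n
    total = 0.0
    for i in range(n):
        x = lo + (i + 0.5) * step
        total += abs(chi_M(x) - phi(x, h))
    return total * step


print(l1_distance(0.1))
```

In this example the integrand is supported on G − F, a set of measure 2h, and is bounded by 1 there, so the distance in the mean is at most 2h, in line with the estimate used in the proof.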


16 These conditions are satisfied by ordinary Lebesgue measure in n-space, and in
many other cases of practical interest.

17 As usual, $\rho(A, x)$ denotes the distance between the set A and the point x (see Problem
9, p. 54).
The space $L_1(X, \mu)$ depends on the choice of both the space X and the
measure $\mu$. For example, $L_1(X, \mu)$ is essentially a finite-dimensional
space if $\mu$ is concentrated on a finite set of points (why?). In analysis, we
are mainly interested in the case where $L_1$ is infinite-dimensional but has a
countable everywhere dense subset.¹⁸ To characterize such spaces, we
introduce the following concept, stemming from general measure theory:


DEFINITION. Suppose a space X equipped with a measure $\mu$ has a
countable system $\mathcal{A}$ of measurable subsets $A_1, A_2, \ldots$ such that
given any $\varepsilon > 0$ and any measurable subset $M \subset X$, there is a
set $A_k \in \mathcal{A}$ satisfying the inequality

$$\mu(M \,\triangle\, A_k) < \varepsilon.$$

Then $\mu$ is said to have a countable base, consisting of the sets
$A_1, A_2, \ldots$


Example. Let $\mu$ be a Lebesgue extension of a measure m originally
defined on a countable semiring $\mathcal{S}_m$. Then the ring
$\mathcal{R}(\mathcal{S}_m)$ is obviously itself countable, and hence, by
Theorem 3, p. 277, is a countable base for $\mu$. In particular, ordinary
Lebesgue measure on the line has a countable base, since we can choose the
original semiring $\mathcal{S}_m$ to consist of all intervals (open, closed and
half-open) with rational end points.


THEOREM 3. Let X be a space equipped with a measure $\mu$, and suppose
$\mu$ has a countable base $A_1, A_2, \ldots$ . Then $L_1(X, \mu)$ has a
countable everywhere dense subset.


Proof. We will show that the set M of all finite linear combinations
of the form

$$\sum_{k=1}^{n} c_k f_k, \tag{5}$$

where $f_k$ is the characteristic function of $A_k$ and the numbers
$c_1, \ldots, c_n$ are rational, forms a countable everywhere dense subset of
$L_1 = L_1(X, \mu)$. The countability of M is obvious, and we need only show
that M is everywhere dense in $L_1$. As already noted, the set of all simple
functions taking only finitely many values is everywhere dense in $L_1$. But
every such function can be approximated arbitrarily closely by a function of
the same type taking only rational values. Hence we need only show that every
function f taking rational values $y_1, \ldots, y_n$ on pairwise disjoint sets
$E_1, \ldots, E_n$ (with X as their union) can be approximated arbitrarily
closely in the $L_1$-metric by functions of the form (5). Clearly, there is
no loss of generality in assuming that the base $A_1, A_2, \ldots$ is closed
under the operations of taking differences and forming finite unions and
intersections (why?).


18 So that $L_1$ is separable, as defined on p. 48.
Now, according to the definition, given any $\varepsilon > 0$, there are sets
$A_1, \ldots, A_n$ such that

$$\mu[(E_k - A_k) \cup (A_k - E_k)] < \varepsilon \qquad (k = 1, \ldots, n).$$

Let

$$A_k' = A_k - \bigcup_{j<k} A_j \qquad (k = 1, \ldots, n)$$

and define a function

$$f^*(x) = \begin{cases} y_k & \text{if } x \in A_k', \\ 0 & \text{if } x \in X - \bigcup_{k=1}^{n} A_k'. \end{cases}$$

Then clearly

$$\mu\{x : f(x) \ne f^*(x)\}$$

can be made arbitrarily small, and hence the left-hand side of

$$\int |f(x) - f^*(x)|\,d\mu \le 2\,\big(\max_k |y_k|\big)\,\mu\{x : f(x) \ne f^*(x)\}$$

can be made arbitrarily small by choosing $\varepsilon > 0$ sufficiently small.
This proves the theorem, since $f^*$ is a function of the form (5). ∎


37.2. Definition and basic properties of $L_2$. As we have seen, the space
$L_1 = L_1(X, \mu)$ is a Banach space, i.e., a complete normed linear space.
However, $L_1$ is not Euclidean, since its norm cannot be derived from any
scalar product. This follows from the "parallelogram theorem" (Theorem
15, p. 160). For example, if $X = [0, 2\pi]$ and $\mu$ is ordinary Lebesgue
measure on the line, then the condition

$$\|f + g\|^2 + \|f - g\|^2 = 2(\|f\|^2 + \|g\|^2)$$

fails for the summable functions f(x) = 1, g(x) = sin x.¹⁹ To get a function
space which is not only a normed linear space but also a Euclidean space,
we now consider the set of functions whose squares are summable.
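The failure of the parallelogram condition for f(x) = 1, g(x) = sin x can be checked numerically. The following sketch uses a midpoint rule for the $L_1$ norm on [0, 2π]; the helper name `l1_norm` and the number of subintervals are illustrative assumptions.

```python
import math


def l1_norm(func, a=0.0, b=2.0 * math.pi, n=20_000):
    """Midpoint-rule approximation of the integral of |func| over [a, b]."""
    step = (b - a) / n
    return sum(abs(func(a + (i + 0.5) * step)) for i in range(n)) * step


def f(x):
    return 1.0


g = math.sin

# Left-hand and right-hand sides of the parallelogram condition in the L1 norm.
lhs = l1_norm(lambda x: f(x) + g(x)) ** 2 + l1_norm(lambda x: f(x) - g(x)) ** 2
rhs = 2.0 * (l1_norm(f) ** 2 + l1_norm(g) ** 2)

print(lhs, rhs)
```

Here the norms of f + g, f − g and f all equal 2π, while the norm of g equals 4, so the two sides are 8π² and 8π² + 32 and cannot agree.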

Thus let X be a space equipped with a measure $\mu$, where we temporarily
assume that $\mu(X) < \infty$. Then by $L_2(X, \mu)$, or simply $L_2$, we mean
the set of all real functions f whose squares are summable on X, i.e., which
satisfy the condition

$$\int f^2(x)\,d\mu < \infty$$

(however, see Problem 6). As in the case of $L_1$, we do not distinguish
between equivalent functions (i.e., functions differing only on a set of
measure zero).


19 As an exercise, show that the same kind of counterexample works quite generally.
THEOREM 4. If f and g belong to $L_2$, then so do $\alpha f$ and $f + g$, where
$\alpha$ is an arbitrary constant, while the product $fg$ belongs to $L_1$. In
particular, $L_2$ is a linear space.


Proof. Obviously $\alpha f \in L_2$, since

$$\int [\alpha f(x)]^2\,d\mu = \alpha^2 \int f^2(x)\,d\mu < \infty.$$

The fact that $fg \in L_1$ follows from the inequality

$$|f(x)g(x)| \le \tfrac{1}{2}\,[f^2(x) + g^2(x)] \tag{6}$$

and Theorem 3, p. 297.²⁰ But then $f + g \in L_2$, since

$$[f(x) + g(x)]^2 \le f^2(x) + 2\,|f(x)g(x)| + g^2(x),$$

where each term on the right is summable. ∎


Next we define a scalar product in $L_2$, setting

$$(f, g) = \int f(x)g(x)\,d\mu.$$

This choice obviously has all the properties of a scalar product listed on
p. 142:


1) $(f, f) \ge 0$ where $(f, f) = 0$ if and only if $f = 0$;
2) $(f, g) = (g, f)$;
3) $(\lambda f, g) = \lambda (f, g)$;
4) $(f, g_1 + g_2) = (f, g_1) + (f, g_2)$.


(In asserting that $(f, f) = 0$ if and only if $f = 0$, we rely on the fact that
every function vanishing almost everywhere is identified with the zero element
of $L_2$.) Thus $L_2$ is a Euclidean space, with the norm defined by the usual
formula

$$\|f\| = \sqrt{(f, f)} \tag{7}$$

(recall Theorem 1, p. 142). In the case of $L_2$, (7) takes the form

$$\|f\| = \sqrt{\int f^2(x)\,d\mu}.$$

By the same token, the distance between two elements $f, g \in L_2$ is just

$$\rho(f, g) = \|f - g\| = \sqrt{\int [f(x) - g(x)]^2\,d\mu}.$$

The quantity

$$\int [f(x) - g(x)]^2\,d\mu = \|f - g\|^2$$

is called the mean square deviation of the functions f and g (from each other).


20 Setting g(x) = 1 in (6), we find that $f \in L_2$ implies $f \in L_1$ (provided that X is of finite
measure).
Let $\{f_n\}$ be a sequence of functions in $L_2$. Then $\{f_n\}$ is said to
converge in the mean square to a function $f \in L_2$ if $\rho(f_n, f) \to 0$ as
$n \to \infty$.
In $L_2$, as in any other Euclidean space, we have the Schwarz inequality

$$|(f, g)| \le \|f\|\,\|g\|,$$

which here takes the form

$$\Big|\int f(x)g(x)\,d\mu\Big| \le \sqrt{\int f^2(x)\,d\mu}\,\sqrt{\int g^2(x)\,d\mu}. \tag{8}$$

The $L_2$-version of the triangle inequality

$$\|f + g\| \le \|f\| + \|g\|$$

is clearly

$$\sqrt{\int [f(x) + g(x)]^2\,d\mu} \le \sqrt{\int f^2(x)\,d\mu} + \sqrt{\int g^2(x)\,d\mu}.$$

In particular, replacing f by |f| and setting g(x) = 1 in (8), we get

$$\int |f(x)|\,d\mu \le \sqrt{\mu(X)}\,\sqrt{\int f^2(x)\,d\mu}, \tag{9}$$

from which it is again apparent (cf. footnote 20) that $f \in L_2$ implies
$f \in L_1$ if $\mu(X) < \infty$.


THEOREM 5. The space $L_2$ is complete.


Proof. Let $\{f_n\}$ be a Cauchy sequence in $L_2$, so that

$$\|f_m - f_n\| \to 0 \quad \text{as} \quad m, n \to \infty.$$

Then, by (9), given any $\varepsilon > 0$, we have

$$\int |f_m(x) - f_n(x)|\,d\mu \le \sqrt{\mu(X)}\,\sqrt{\int [f_m(x) - f_n(x)]^2\,d\mu} < \varepsilon \sqrt{\mu(X)}$$

for sufficiently large m and n, i.e., $\{f_n\}$ is also a Cauchy sequence in the
$L_1$-metric. Repeating the argument given in the proof of the completeness
of $L_1$, we choose a subsequence $\{f_{n_k}\}$ from $\{f_n\}$ converging almost
everywhere to some function f. Clearly, given any $\varepsilon > 0$, we have

$$\int [f_{n_k}(x) - f_{n_l}(x)]^2\,d\mu < \varepsilon \tag{10}$$

for sufficiently large k and l. Hence, by Fatou's theorem (Theorem 3,
p. 307), we can take the limit as $l \to \infty$ behind the integral sign in
(10), obtaining

$$\int [f_{n_k}(x) - f(x)]^2\,d\mu \le \varepsilon.$$

It follows that $f \in L_2$ (why?) and that $f_{n_k} \to f$ in the mean square.
But if a Cauchy sequence contains a subsequence converging to a limit, then the
sequence itself must converge to the same limit. Hence $f_n \to f$ in the
mean square. ∎


We now drop the restriction $\mu(X) < \infty$, allowing X to have infinite
measure. In the case $\mu(X) = \infty$, it is no longer true that $f \in L_2$
implies $f \in L_1$, a fact deduced from (6) or (9) in the case
$\mu(X) < \infty$. For example, let X be the real line equipped with ordinary
Lebesgue measure, and let

$$f(x) = \frac{1}{\sqrt{1 + x^2}}.$$

Then f belongs to $L_2$ but not to $L_1$, since

$$\int_{-\infty}^{\infty} \frac{dx}{\sqrt{1 + x^2}} = \infty, \qquad \int_{-\infty}^{\infty} \frac{dx}{1 + x^2} = \pi.$$
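The behavior of this example can be seen from truncated integrals over [−R, R], which have closed forms. The sketch below assumes the standard antiderivatives arctan x for 1/(1 + x²) and arsinh x for 1/√(1 + x²); the function names are illustrative.

```python
import math


def truncated_square_integral(R):
    """Integral of f(x)**2 = 1 / (1 + x**2) over [-R, R], via arctan."""
    return 2.0 * math.atan(R)


def truncated_abs_integral(R):
    """Integral of |f(x)| = 1 / sqrt(1 + x**2) over [-R, R], via arsinh."""
    return 2.0 * math.asinh(R)


for R in (1e1, 1e3, 1e6):
    print(R, truncated_square_integral(R), truncated_abs_integral(R))
```

As R grows, the first quantity stays below π and approaches it, while the second grows like 2 log(2R) without bound.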


Moreover, if a sequence $\{f_n\}$ converges to a limit f in the $L_2$-metric, it
follows from (9) that $\{f_n\}$ also converges to f in the $L_1$-metric if
$\mu(X) < \infty$. However, this conclusion fails if $\mu(X) = \infty$, as shown
by the example

$$f_n(x) = \begin{cases} \dfrac{1}{n} & \text{if } |x| \le n, \\ 0 & \text{if } |x| > n, \end{cases}$$

where $\{f_n\}$ approaches no limit in $L_1$ but approaches the zero function in
$L_2$ (give the details). Despite all this, we have²¹


THEOREM 5′. The space $L_2$ is complete even if $\mu(X) = \infty$, provided
that $\mu$ is σ-finite.


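Proof. As in Sec. 30.2, let

$$X = \bigcup_{n=1}^{\infty} X_n, \qquad \mu(X_n) < \infty,$$

where

$$X_1 \subset X_2 \subset \cdots \subset X_n \subset \cdots.$$

Moreover, given any function $\varphi$ on X, let

$$\varphi^{(n)}(x) = \begin{cases} \varphi(x) & \text{if } x \in X_n, \\ 0 & \text{if } x \notin X_n, \end{cases}$$


21 Note that in the proof of the completeness of $L_1$ (Theorem 1), X can have either
finite or infinite measure.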
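so that

$$\int \varphi(x)\,d\mu = \lim_{n \to \infty} \int_{X_n} \varphi(x)\,d\mu = \lim_{n \to \infty} \int \varphi^{(n)}(x)\,d\mu,$$

if $\varphi$ is summable on X. Let $\{f_k\}$ be a Cauchy sequence in $L_2$, so
that, given any $\varepsilon > 0$,

$$\int [f_k(x) - f_l(x)]^2\,d\mu < \varepsilon$$

for all sufficiently large k and l. Then

$$\lim_{n \to \infty} \int [f_k^{(n)}(x) - f_l^{(n)}(x)]^2\,d\mu = \int [f_k(x) - f_l(x)]^2\,d\mu < \varepsilon,$$

and hence, a fortiori,

$$\int_{X_n} [f_k^{(n)}(x) - f_l^{(n)}(x)]^2\,d\mu < \varepsilon. \tag{11}$$

But $L_2(X_n, \mu)$ is complete, by Theorem 5, since $\mu(X_n) < \infty$.
Therefore $\{f_k^{(n)}\}$ converges in the metric of $L_2(X_n, \mu)$ to a
function $f^{(n)} \in L_2(X_n, \mu)$. Taking the limit as $l \to \infty$ behind
the integral sign in (11), we get

$$\int_{X_n} [f_k^{(n)}(x) - f^{(n)}(x)]^2\,d\mu \le \varepsilon \tag{12}$$

(why is this justified?). Since (12) holds for every n, we can now take
the limit as $n \to \infty$, obtaining

$$\lim_{n \to \infty} \int_{X_n} [f_k^{(n)}(x) - f^{(n)}(x)]^2\,d\mu \le \varepsilon. \tag{13}$$

Now let

$$f(x) = f^{(n)}(x) \quad \text{if} \quad x \in X_n.$$

Then (13) implies

$$\int [f_k(x) - f(x)]^2\,d\mu \le \varepsilon.$$

It follows that $f \in L_2(X, \mu)$ and $f_k \to f$ in the mean square. ∎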


Problem 1. A complex function is said to be summable if its real and
imaginary parts are summable. Show that the considerations of Sec. 37.1
carry over verbatim to the case where $L_1$ consists of all complex summable
functions (defined on X).


Problem 2. Prove that if each of the measures $\mu_1$ and $\mu_2$ has a
countable base, then so does their direct product $\mu = \mu_1 \times \mu_2$.


Comment. In particular, Lebesgue measure in the plane (or more
generally in n-space) has a countable base.
Problem 3. Let X be the interval [a, b], and let $\mu$ be ordinary Lebesgue
measure on the line. Prove that the set $\mathcal{P}$ of all polynomials on
[a, b] with rational coefficients is everywhere dense in $L_1(X, \mu)$.


Hint. Use Theorem 2 and the fact that every function continuous on
[a, b] can be approximated in the mean (or even uniformly) by elements of
$\mathcal{P}$.


Problem 4. Prove that $L_2(X, \mu)$ is separable, i.e., has a countable
everywhere dense subset, if $\mu$ has a countable base.


Comment. Thus $L_2(X, \mu)$ is a Hilbert space if $\mu$ has a countable base
(we disregard the case where $L_2(X, \mu)$ is finite-dimensional). It follows
from Theorem 11, p. 155 that all such spaces are isomorphic, in particular, that
$L_2(X, \mu)$ is isomorphic to the space $l_2$ of all sequences
$(x_1, x_2, \ldots, x_n, \ldots)$ such that

$$\sum_{n=1}^{\infty} x_n^2 < \infty$$

(in fact, $l_2$ corresponds to the case where the measure $\mu$ is concentrated
on a countable set of points).


Problem 5. Prove that every continuous linear functional $\varphi$ on
$L_2(X, \mu)$, where $\mu$ has a countable base, can be represented in the form

$$\varphi(f) = \int f(x)g(x)\,d\mu,$$

where g is a fixed element of $L_2(X, \mu)$.

Hint. Recall Theorem 2, p. 188.


Problem 6. Show that the considerations of Sec. 37.2 carry over verbatim
to the case where $L_2$ consists of all complex functions f satisfying the
condition

$$\int |f(x)|^2\,d\mu < \infty,$$

provided the scalar product of two such functions f and g is now defined as

$$(f, g) = \int f(x)\,\overline{g(x)}\,d\mu.$$

Show that the resulting space $L_2$ is a complex Hilbert space if the measure
$\mu$ has a countable base (again disregard the finite-dimensional case).


Problem 7. Let $\{f_n\}$ be a sequence of functions defined on a space X
equipped with a measure $\mu$ such that $\mu(X) < \infty$. Prove that


a) If $\{f_n\}$ converges uniformly, then $\{f_n\}$ converges in the mean and in
the mean square;

b) If $\{f_n\}$ converges in the mean or in the mean square, then $\{f_n\}$
converges in measure (as defined in Problem 6, p. 292);
c) If $\{f_n\}$ converges in the mean or in the mean square, then $\{f_n\}$
contains a subsequence $\{f_{n_k}\}$ which converges almost everywhere.


Hint. See Problem 9, p. 292. Alternatively, recall the proof of Theorem 1.


Problem 8. Prove that the sequence of functions constructed in Problem
8, p. 292 converges to f(x) = 0 in the mean and in the mean square, without
converging at a single point.


Problem 9. Give an example of a sequence of functions $\{f_n\}$ which
converges everywhere on [0, 1], but does not converge in the mean.


Hint. Let

$$f_n(x) = \begin{cases} n & \text{if } x \in (0, 1/n), \\ 0 & \text{otherwise}. \end{cases}$$
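The sequence in the hint can be examined with exact arithmetic. This is a sketch only: the helper names are illustrative, and the $L_1$ norm is computed from the closed form "height n times length 1/n" rather than by numerical integration.

```python
from fractions import Fraction


def f(n, x):
    """f_n(x) = n on the open interval (0, 1/n), and 0 elsewhere on [0, 1]."""
    return n if 0 < x < Fraction(1, n) else 0


def l1_norm(n):
    """Integral of |f_n| over [0, 1]: a rectangle of height n and width 1/n."""
    return n * Fraction(1, n)


x = Fraction(1, 20)
print([f(n, x) for n in (5, 10, 20, 40)])   # values at a fixed point x
print([l1_norm(n) for n in (5, 10, 20, 40)])
```

For any fixed x > 0 the values f_n(x) vanish as soon as n ≥ 1/x, and f_n(0) = 0 for all n, so the sequence converges to 0 everywhere; the integrals, however, all equal 1, so the sequence cannot converge to 0 in the mean.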


Problem 10. Give an example of a sequence of functions $\{f_n\}$ which
converges uniformly, but does not converge in the mean or in the mean
square.


Hint. According to Problem 7a, we must have $\mu(X) = \infty$. Let

$$f_n(x) = \begin{cases} \dfrac{1}{\sqrt{n}} & \text{if } |x| \le n, \\ 0 & \text{if } |x| > n. \end{cases}$$
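For the sequence in the hint, the three norms involved have simple closed forms, which the following sketch tabulates. It assumes $f_n$ equals 1/√n on [−n, n] and 0 elsewhere, on the real line with Lebesgue measure; the function names are illustrative.

```python
import math


def sup_norm(n):
    """Largest value of |f_n|, which governs uniform convergence to 0."""
    return 1.0 / math.sqrt(n)


def l1_norm(n):
    """Integral of |f_n|: height 1/sqrt(n) over an interval of length 2n."""
    return 2.0 * n / math.sqrt(n)


def l2_norm(n):
    """Square root of the integral of f_n**2: height 1/n over length 2n."""
    return math.sqrt(2.0 * n * (1.0 / n))


for n in (1, 100, 10_000):
    print(n, sup_norm(n), l1_norm(n), l2_norm(n))
```

The supremum tends to 0, so the convergence to 0 is uniform, while the $L_1$ norm grows like 2√n and the $L_2$ norm stays at √2, so there is no convergence to 0 in the mean or in the mean square.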


Problem 11. Show that convergence in the mean need not imply
convergence in the mean square, whether or not $\mu(X) < \infty$.


Problem 12. Let $L_p(X, \mu)$ be the set of all classes of equivalent (real or
complex) functions f such that

$$\int |f|^p\,d\mu < \infty \qquad (1 \le p < \infty),$$

equipped with the norm

$$\|f\| = \Big(\int |f|^p\,d\mu\Big)^{1/p}.$$

Prove that $L_p(X, \mu)$ is a Banach space.


BIBLIOGRAPHY 


Akhiezer, N. I. and I. M. Glazman, Theory of Linear Operators in Hilbert Space 
(translated by M. Nestell), Frederick Ungar Publishing Co., New York, Volume I 
(1961), Volume II (1963).

Berberian, S. K., Measure and Integration, The Macmillan Co., New York (1965). 

Burkill, J. C., The Lebesgue Integral, Cambridge University Press, New York 
(1953). 

Day, M. M., Normed Linear Spaces, Springer-Verlag, New York (1963). 

Dieudonné, J., Foundations of Modern Analysis, Academic Press, Inc., New York
(1960). 

Dunford, N. and J. T. Schwartz, Linear Operators, Interscience Publishers, Inc.,
New York, Part I: General Theory (1958), Part II: Spectral Theory (1963). 

Edwards, R. E., Functional Analysis, Theory and Applications, Holt, Rinehart 
and Winston, New York (1965). 

Fraenkel, A. A., Abstract Set Theory, third edition, North-Holland Publishing 
Co., Amsterdam (1966). 

Friedman, A., Generalized Functions and Partial Differential Equations,
Prentice-Hall, Inc., Englewood Cliffs, N.J. (1963).

Gelfand, I. M. and G. E. Shilov, Generalized Functions (translated by E. Saletan
et al.), Academic Press, Inc., New York, Volume 1: Properties and Operations 
(1964), Volume 2: Spaces of Fundamental and Generalized Functions (1968). 

Hahn, H. and A. Rosenthal, Set Functions, University of New Mexico Press, 
Albuquerque, New Mexico (1948). 

Halmos, P. R., Measure Theory, D. Van Nostrand Co., Inc., Princeton, N.J. (1950). 

Halmos, P. R., Introduction to Hilbert Space, second edition, Chelsea Publishing 
Co., New York (1957). 


Halmos, P. R., Naive Set Theory, D. Van Nostrand Co., Inc., Princeton, N.J. 
(1960). 

Halmos, P. R., A Hilbert Space Problem Book, D. Van Nostrand Co., Inc., 
Princeton, N.J. (1967). 

Hewitt, E. and K. Stromberg, Real and Abstract Analysis, Springer-Verlag, New 
York (1965). 

Hildebrandt, T. H., Introduction to the Theory of Integration, Academic Press, Inc., 
New York (1963). 

Kelley, J. L., General Topology, D. Van Nostrand Co., Inc., Princeton, N.J. (1955). 

Kelley, J. L., I. Namioka et al., Linear Topological Spaces, D. Van Nostrand Co., 
Princeton, N.J. (1963). 

Kuratowski, K., Introduction to Set Theory and Topology (translated by L. F. 
Boron), Pergamon Press, Inc., New York (1961). 

Liusternik, L. A. and V. I. Sobolev, Elements of Functional Analysis (translated by 
A.E. Labarre, Jr. et al.), Frederick Ungar Publishing Co., Inc., New York (1961). 

Loéve, M., Probability Theory, third edition, D. Van Nostrand Co., Inc., Princeton, 
N.J. (1963). 

McShane, E. J., Integration, Princeton University Press, Princeton, N.J. (1944). 

Natanson, I. P., Theory of Functions of a Real Variable (translated by L. F. Boron, 
with the collaboration of E. Hewitt), Frederick Ungar Publishing Co., New York, 
Volume I (1955), Volume II (1960). 

Riesz, F. and B. Sz.-Nagy, Functional Analysis (translated by L. F. Boron), 
Frederick Ungar Publishing Co., New York (1955). 

Royden, H. L., Real Analysis, second edition, The Macmillan Co., New York 
(1968). 

Rudin, W., Principles of Mathematical Analysis, second edition, McGraw-Hill 
Book Co., Inc., New York (1964). 

Saks, S., Theory of the Integral (translated by L. C. Young, with two notes by S. 
Banach), second edition, Dover Publications, Inc., New York (1964). 

Schaefer, H. H., Topological Vector Spaces, The Macmillan Co., New York (1966).

Shilov, G. E., Generalized Functions and Partial Differential Equations (translated 
by B. D. Seckler), Gordon and Breach Science Publishers, Inc., New York (1968). 

Shilov, G. E. and B. L. Gurevich, Integral, Measure and Derivative: A Unified 
Approach (translated by R. A. Silverman), Prentice-Hall, Inc., Englewood Cliffs,
N.J. (1966). 

Taylor, A. E., Introduction to Functional Analysis, J. Wiley and Sons, Inc., New 
York (1958). 

Titchmarsh, E. C., The Theory of Functions, second edition, Oxford University 
Press, New York (1939). 

Yosida, K., Functional Analysis, second edition, Springer-Verlag, New York (1968). 

Zaanen, A. C., An Introduction to the Theory of Integration, Interscience Publishers, 
Inc., New York (1958). 


A 


Absolutely continuous charge, 347 
Absolutely continuous function, 336 
Absolutely summable sequence, 185 
Adjoint operator, 232 
in Hilbert space, 234 
Aleph null, 16 
Alexandroff, P. S., 90, 97 
Algebra of sets, 31 
Algebraic dimension, 128 
Algebraic number, 19 
Almost everywhere, 288 
Angle between vectors, 143 
Arzela’s theorem, 102 
generalization of, 107 
Axiom of choice, 27 
Axiom of countability: 
first, 93 
second, 82 
Axiom of separation: 
first, 85 
Hausdorff, 85 
second, 85 


Baire’s theorem, 61 
B-algebra (see Borel algebra) 
Banach, S., 138, 229, 238 
Banach space, 140 
Base, 81 

countable, 382 

neighborhood (local), 83 
Basis, 121 

dual, 185 
Hamel, 128 
orthogonal, 143 
orthonormal, 143 
Bessel’s inequality, 150, 165 
Bicompactum, 96 
Binary relation (see Relation) 
Birkhoff, G., 28 
Bolzano-Weierstrass theorem, 101 
Borel algebra, 35 
irreducible, 36 
minimal, 36 
Borel closure, 36 
Borel sets, 36 
Bounded linear functional, 177 
norm of, 177 
Bounded real function, 110 
Bounded set, 65, 141, 169 
locally, 169 
strongly, 197 
weakly, 197 
B-set (see Borel set) 
C


Cantor, G., 29
Cantor function, 335 
Cantor set, 52 
points of the first kind of, 53 
points of the second kind of, 53 
Cantor-Bernstein theorem, 17 
Cardinal number, 24 


Cartesian product (see Direct product) 


Cauchy criterion, 56 
Cauchy sequence, 56 
Cauchy-Schwarz inequality, 38 
Chain, 28
maximal, 28 
Characteristic function, 349 
Charge, 344 
absolutely continuous, 347 
concentrated, on a set, 346 
continuous, 346 
density of, 350 
discrete, 347 
negative, 344 
negative variation of, 346 
positive, 344 
positive variation of, 346 
Radon-Nikodym derivative of, 350 
singular, 347 
total variation of, 346 
Chebyshev’s inequality, 299 
Choice function, 27 
Classes, 6 
equivalence, 8 
Closed ball (see Closed sphere) 
Closed graph theorem, 238 
Closed set(s), 49 
in a topological space, 79 
on the real line, 51 
unions and intersections of, 49 
Closed sphere(s), 46 
center of, 46 
nested (or decreasing) sequence of, 
59 
radius of, 46 
Closure, 46, 79 
Closure operator, 46 
properties of, 46 
Codimension, 122 
Cohen, P. J., 29 
Compact space, 92 
countably, 95 
locally, 97 
Compactness, 92 
countable, 95 
relative, 97 
relative countable, 97 
Compactum, 92, 96 
metric, 96 
Complement of a set, 3 
Complete limit point, 97 
Complete measure, 280 
Completely continuous operator(s), 239 ff. 
basic properties of, 243-246 
in Hilbert space, 246-251 
Completely regular space, 92 


Completion (of a metric space), 62 
Component (of an open set), 55 
Conjugate space, 185 
of a normed linear space, 184 
second, 190 
strong topology in, 190 
third, 190 
weak topology in, 200 
weak* topology in, 202 
Connected set, 55 
Connected space, 84 
Contact point, 46, 79 
Continuity, 44, 87 
from the left, 315 
from the right, 315 
uniform, 109 
Continuous charge, 346 
Continuous linear functional(s), 175 ff. 
order of, 182 
sufficiently many, 181 
Continuum, 16 
power of, 16 
Contraction mapping(s), 66 ff. 
and differential equations, 71-72 
and integral equations, 74-76 
and systems of differential equations, 
72-74 
principle of, 66 
Convergence almost everywhere, 289 
Convergence in measure, 292 
Convergence in the mean, 379 
Convergence in the mean square, 385 
Convergent sequence: 
in a metric space, 47 
in a topological space, 84 
Convex body, 129 
Convex functional, 130, 134 
Convex hull, 130 
Convex set, 129 
Convexity, 128 
Countability of rational numbers, 11 
Countable additivity, 266, 272 
Countable base, 382 
Countable set, 10 
Countably compact space, 95 
Countably Hilbert space, 173 
Countably normed (linear) space, 171 
complete, 173 
Cover, 83 
closed, 83 
open, 83 
Covering (see Cover) 
Curve(s): 
in a metric space, 112-113 
length of, 114, 115 
sequence of, 115 
rectifiable, 332 


D 


Decomposition of a set into classes, 6-9 
δ-algebra, 35 
δ-ring, 35 
Delta function, 124, 208 
Dense set, 48 
everywhere, 48 
nowhere, 48, 61 
Density, 350 
Derived numbers, 318 
left-hand lower, 318 
right-hand upper, 318 
Diameter of a set, 65 
Difference between sets, 3 
Differentiation: 
of a monotonic function, 318-323 
of an integral with respect to its upper limit, 323-326 
Dimension, 121 
algebraic, 128 
Dini’s theorem, 115 
Direct product, 238, 352 
of measures, 354 
Directed set, 29 
Dirichlet function, 289, 291, 301 
Discontinuity point of the first kind, 315 
Discrete charge, 347 
Discrete space, 38 
Disjoint sets, 2 
pairwise, 2 
Distance: 
between a point and a set, 54 
between two sets, 55 
properties of, 37 
symmetry of, 37 
Domain (of definition), 4, 5, 221 
Domain (open connected set), 71 


E 


Egorov’s theorem, 290 
Eigenvalue, 235 
Eigenvector, 235 
Elementary set, 255 
measure of, 256 
Empty set, 2 
ε-neighborhood, 46 
ε-net, 98 
Equicontinuous family of functions, 102 
Equivalence classes, 8 
Equivalence relation, 7 
Equivalent functions, 288 
Equivalent sets, 13 
Essential supremum, 311 
Essentially bounded function, 310 
Euclidean n-space, 38, 144 
Euclidean space(s), 138, 142 ff. 
characterization of, 160 
complete, 153 
norm of vector in, 164 
orthogonal elements of, 164 
components of elements of, 149 
norm in, 142 
separable, 146 
Euler lines, 105 
Exhaustive sequence of sets, 308 
Extension of a functional, 132 
Extension of a measure, 271, 277, 279 
Jordan, 281 


F 


Factor space, 122 
Fatou’s theorem, 307 
Field, 37 
Finite expansion, 33 
Finite function, 208 
Finite set, 10 
First axiom of countability, 83 
First axiom of separation, 85 
Fixed point, 66 
Fixed point theorem, 66 
Fourier coefficients, 149, 152, 165 
Fourier series, 149, 165 
Fractional part, 8 
Fraenkel, A. A., 25, 27 
Fredholm equation, 74 
homogeneous, 74 
kernel of, 74 
nonhomogeneous, 74 
Friedman, A., 212 
Fubini’s theorem, 359 
Function space, 39, 108 
Functional(s), 108, 123 
addition of, 183 
additive, 123 
bounded linear (see Bounded linear functional) 
conjugate-homogeneous, 123 
conjugate-linear, 124 
continuous, 175 
continuous linear (see Continuous linear functionals) 
convex, 130, 134 
extension of, 132 
homogeneous, 123 
linear, 124, 175 ff. 
Minkowski, 131 
null space of, 125 
product of, with a number, 183 
separation of sets by, 136 

Function(s), 4 ff. 
absolutely continuous, 336 
Borel-measurable, 284 
bounded (real), 110, 207 
Cantor, 335 
characteristic, 349 
continuous, 44, 79 
from the left, 315 
from the right, 315 
uniformly, 109 
delta, 124, 208 
domain (of definition of), 4, 5 
equivalent, 288 
essentially bounded, 310 
finite, 207 
general, 5 
generalized (see Generalized functions) 
generating, 362 
infinitely differentiable, 169 
integrable, 294, 296, 308 
locally, 208 
inverse, 5 
jump, 315, 341 
jump of, 315 
left-hand limit of, 315 
lower limit of, 111 
lower semicontinuous, 110 
measurable, 284 ff. 
monotonic, 314 
nondecreasing, 314 
nonincreasing, 314 
of bounded variation, 328-332 
one-to-one, 5 
oscillation of, 111 
range of, 4, 5 
real, 4 
right-hand limit of, 315 
simple, 286 
singular, 341 
step, 316 
summable, 294, 296, 308 
test, 208 
uniformly continuous, 109 
upper limit of, 111 
upper semicontinuous, 110 
Fundamental functions (see Test functions) 
Fundamental parallelepiped, 98 
Fundamental sequence (see Cauchy sequence) 
Fundamental space (see Test space) 


G 


General measure theory, 269 ff. 
Generalized function(s), 124, 206 ff. 
and differential equations, 211-214 
complex, 215 
convergence of, 209 
definition of, 208 
derivative of, 210 
of several variables, 214-215 
on the circle, 216 
operations on, 209-210 
product of, with a number, 209 
product of, with an infinitely differentiable function, 210 
regular, 208 
singular, 208 
sum of, 209 
Gödel, K., 209 
Graph, 238 
Greatest lower bound (in a partially ordered set), 30 
Gurevich, B. L., 350, 351 


H 


Hahn decomposition, 345 
Hahn-Banach theorem, 132, 180 
complex version of, 134, 181 
Hamel basis, 128 
Hausdorff axiom of separation, 85 
Hausdorff space, 85 
Hausdorff’s maximal principle, 28 
Heine-Borel theorem, 92 
Helly’s convergence theorem, 370 
Helly’s selection principle, 372 
Hereditary property, 87 
Hilbert, D., 155 
Hilbert cube, 98 
Hilbert space(s), 155 ff. 
complex, 165 
countably, 173 
isomorphic, 155, 165 
linear manifold in, 156 
closed, 156 
subspace(s) of, 156 
direct sum of orthogonal, 159 
(mutually) orthogonal, 158 
orthogonal complement of, 157 
Hilbert-Schmidt theorem, 248 
Hölder’s inequality, 41 
homogeneity of, 42 
Hölder’s integral inequality, 45 
Homeomorphic mapping, 44, 89 
Homeomorphic spaces, 44, 89 
Homeomorphism, 44, 89 
Hyperplane, 127 


I 


Ideal, two-sided, 252 
Image: 
of an element, 5 
of a set, 5 
Infimum, 51 
Infinite set, 10 
Initial section, 25 
Inner measure, 258, 276 
Integrable function, 294, 296, 308 
Integral part, 8 
Interior, 128 
Interior point, 50 
Intersection of sets, 2 
Into mapping, 5 
Invariant subspace, 238 
Inverse function, 5 
Invisible point: 
from the left, 319 
from the right, 319 
Isolated point, 47 
Isometry, 44 
Isomorphism, 21, 120, 155, 165 
conjugate-linear, 194, 234 
Isomorphism theorem, 155, 165 


J 


Jordan decomposition, 346 
Jordan extension, 281 
Jordan measurable set, 281 
Jordan measure, 281 
Jump, 315 
Jump function, 315, 341 


K 


Kelley, J. L., 87, 90, 92, 97 
Kernel, 74 


L 


Lattice, 30 
Least upper bound (in a partially ordered set), 30 
Lebesgue decomposition, 341, 351, 363 
Lebesgue extension, 277, 279 
Lebesgue integral, 293 ff. 
absolute continuity of, 300-301 
as a set function, 343-351 
indefinite, 313 ff. 
of a general measurable function, 296, 308 
of a simple function, 294 
over a set of infinite measure, 308 
vs. Riemann integral, 293-294, 309-310 
Lebesgue-integrable function (see Integrable function) 
Lebesgue-Stieltjes integral, 364 
vs. Riemann-Stieltjes integral, 368 
Lebesgue’s bounded convergence theorem, 303 
Lebesgue’s theorem: 
on differentiation of a monotonic function, 321 
on integration of the derivative of an absolutely continuous function, 340 
Left-hand limit, 315 
Levi’s theorem, 305 
Limit of a sequence: 
in a metric space, 47 
in a topological space, 84 
Limit point, 47, 79 
complete, 97 
Linear closure, 140 
Linear combination, 120 
Linear dependence, 120 
Linear functional, 175 ff. 
bounded (see Bounded linear functional) 
continuous (see Continuous linear functionals) 
Linear hull, 122 
Linear independence, 121 
Linear manifold, 140, 156 
Linear operator, 221 
bounded, 223 
norm of, 224 
spectral radius of, 239 
closed, 237 
completely continuous (see Completely continuous operators) 
graph of, 238 
Linear space(s), 118 ff. 
basis in, 121 
Hamel, 128 
closed segment in, 128 
complex, 119 
countably normed, 171 
dimension of, 121 
algebraic, 128 
finite-dimensional, 121 
functionals on (see Functionals) 
infinite-dimensional, 121 
isomorphic, 120 
linearly dependent elements of, 120 
linearly independent elements of, 121 
n-dimensional, 121 
normed (see Normed linear spaces) 
open segment in, 128 
real, 119 
subspace, 121 
proper, 121 
topological (see Topological linear space) 
Linearly ordered set (see Ordered set) 
Lipschitz condition, 55 
Locally integrable function, 208 
Lower limit, 111 
Lower semicontinuous function, 110 
Luzin’s theorem, 293 


M 


Mapping, 5 ff. 
continuous, 44, 87 
contraction, 66 
fixed point of, 66 
into, 5 
natural, 191 
one-to-one, 5 
onto, 5 
order-preserving, 21 
Mathematical expectation, 366 
Mathematical induction, 28 
Mean square deviation, 384 
Mean (value), 366 
Measurable function, 284 ff. 
integration of, 294, 296, 308 
Measurable set(s), 259 ff., 267 
decreasing sequence of, 266 
increasing sequence of, 267 
Jordan, 281 
Measure(s), 254 ff. 
additivity of, 255, 263 
complete, 280 
continuity of, 267 
countably (σ-) additive, 266, 272 
direct product of, 354 
extension(s) of, 271, 275-283 
inner, 258, 276 
Jordan, 281 
Lebesgue, 259, 276, 279 
of an elementary set, 256 
of a plane set, 259, 276 
of a rectangle, 255 
on a semiring, 270 
outer, 258, 276 
product, 354 
σ-finite, 308 
signed, 344 
Stieltjes (see Stieltjes measure) 
with a countable base, 382 
Measure space, 294 
Method of successive approximations, 66, 67 
Metric (see Distance) 
Metric space(s), 37 ff. 
complete, 56 
completion of, 62 
continuous curves in, 112-113 
length of, 114, 115 
sequence of, 115 
continuous mapping of, 44 
convergence in, 47 
incomplete, 56 
isometric, 44 
isometric mapping of, 44 
real functions on, 108 
equivalent continuous, 113 
uniformly continuous, 109 
relatively compact subsets of, 101 
separable, 48 
subspace of, 43 
total boundedness of, 97-99 
compactness and, 99-101 
Metrizable space, 90 
Minkowski functional, 131 
Minkowski’s inequality, 41 
Minkowski’s integral inequality, 45 
Monotonic function, 314 


N 


n-dimensional simplex, 137 
k-dimensional face of, 137 
vertices of, 137 
n-dimensional (vector) space, 119 
Negative set, 344 
Neighborhood, 46, 79 
Neighborhood base, 83 
at zero, 168 
Nested sphere theorem, 60 
Noncomparable elements, 21 
Nondecreasing function, 314 
Nonincreasing function, 314 
Nonmeasurable set, 268 
Normal space, 86 
Normed linear space(s), 138 
bounded subset of, 141 
complete, 140 
complete set in, 140 
conjugate space of, 184 
direct product of, 238 
subspaces of, 140 
Norm(s), 138, 142, 163 
compatible, 171 
comparable, 172 
equivalent, 141, 172 
of a bounded linear functional, 177 
of a bounded linear operator, 224 
properties of, 138 
stronger, 172 
weaker, 172 
n-space, 119 
Null space, 125 


O 


One-to-one correspondence, 5, 10, 13 
One-to-one function, 5 
Onto mapping, 5 
Open ball (see Open sphere) 
Open set(s), 50 
component of, 55 
in a topological space, 78 
on the real line, 51 
unions and intersections of, 50 
Open sphere, 45 
center of, 46 
radius of, 46 
Operator(s), 221 ff. 
adjoint, 232 
in Hilbert space, 234 
continuous, 221 
degenerate, 240 
domain (of definition) of, 221 
eigenvalue of, 235 
eigenvector of, 235 
identity (or unit), 222 
inverse, 228 
invertible, 228 
linear (see Linear operator) 
product of, 225 
with a number, 225 
projection, 223 
resolvent of, 236 
self-adjoint, 235 
spectrum of, 235 
sum of, 225 
zero, 222 
Order type (see Type) 
Ordered product, 23 
Ordered set, 21 
Ordered sum, 22 
Order-preserving mapping, 21 
Ordinal, 24 
transfinite, 24 
Ordinal number(s), 24 
comparison of, 25 
Orthogonal basis, 143 
Orthogonal complement, 157 
Orthogonal system, 143 
complete, 143 
Orthogonal vectors, 143 
Orthogonalization, 148 
Orthogonalization theorem, 147 
Orthonormal basis, 143 
Orthonormal system, 143 
closed, 151 
complete, 143 
vs. closed, 151 
Oscillation, 111 
Outer measure, 258, 276 


P 


Parseval’s theorem, 151 
Partial ordering, 20 
Partially ordered set(s), 20 
isomorphic, 21 
maximal element of, 21 
minimal element of, 21 
noncomparable elements of, 21 
Partition of a set into classes, 6-9 
Peano’s theorem, 104 
Petrovski, I. G., 76 
Picard’s theorem, 71 
Polygonal line, 55 
Positive set, 344 
Power: 
of a set, 16 
of the continuum, 16 
Preimage: 
of a set, 5 
of an element, 5 
Principle of contraction mapping, 66 
Probability density, 367 
Product measure, 354, 356 
evaluation of, 356-359 
Projection operator, 223 
Proper subspace, 121 


Q 


Quotient space (see Factor space) 


R 


Radon-Nikodym derivative, 350 
Radon-Nikodym theorem, 347 
Random variable, 366 
continuous, 366 
discrete, 366 
mathematical expectation of, 366 
mean (value) of, 366 
probability density of, 367 
variance of, 366 
Range, 4, 5 
Rectangle, 255 
closed, 255 
half-open, 255 
measure of, 255 
open, 255 
Rectifiable curve, 332 
Reflexive space, 191 
Reflexivity, 7 
Relation, 7 
antisymmetric, 7 
binary, 7 
equivalence, 7 
reflexive, 7 
symmetric, 7 
transitive, 7 
Relatively compact subset, 97 
Relatively countably compact subset, 97 
Residue class, 122 
Resolvent, 236 
Riemann integral, 293 
vs. Lebesgue integral, 293-294, 309-310 
Riemann-Stieltjes integral, 367 
vs. Lebesgue-Stieltjes integral, 368 
Riesz lemma, 319 
Riesz representation theorem, 374 
Riesz-Fischer theorem, 153 
Right-hand limit, 315 
Ring of sets, 31 
minimal, generated by a semiring, 34 
minimal, generated by a system of sets, 32 
Rozanov, Y. A., 366 


S 


Scalar product, 142 
complex, 163 
Schwartz, L., 212 
Schwarz’s inequality, 40, 142 
Second axiom of countability, 82 
Second axiom of separation, 85 
Self-adjoint operator, 235 
Semireflexive space, 191 
Semiring of sets, 32 
finite expansion in, 33 
minimal ring generated by, 34 
Separable (metric) space, 48 
Set of σ-uniqueness, 282 
Set of uniqueness, 282 
Set theory, 1-36 
naive vs. axiomatic, 29 
Set(s), 1 ff. 
algebra of, 31 
bounded, 65, 141 
totally, 98 
Cantor, 52 
closed, 49 
closure of, 46 
complement of, 3 
connected, 55 
contact point of, 46 
convex, 129 
countable, 10 
curly bracket notation for, 1 
decomposition of, 6 
dense, 48 
everywhere, 48 
nowhere, 48, 61 
diameter of, 65 
difference between, 3 
direct product of, 352 
directed, 29 
disjoint, 2 
pairwise, 2 
duality principle for, 4 
elementary, 255 
elements of, 1 
empty, 2 
equivalent, 13 
exhaustive sequence of, 308 
finite, 10 
infinite, 10 
interior of, 128 
interior point of, 50 
intersection of, 2 
isolated point of, 47 
Jordan measurable, 281 
(Lebesgue) measurable, 259, 267, 276, 279 
limit point of, 47 
complete, 97 
measure of, 259, 267, 276, 279 
negative, 344 
nonmeasurable, 268 
of uniqueness, 282 
of σ-uniqueness, 282 
open, 50 
operations on, 2 ff. 
ordered, 21 
partially ordered, 20 
partition of, 6 
positive, 344 
power of, 16 
ring of, 31 
semiring of, 32 
subset of, 1 
proper, 2 
sum of, 2 
symmetric, 171 
symmetric difference of, 3, 4 
systems of, 31-36 
totally bounded, 98 
uncountable, 10 
union of, 2 
well-ordered, 23 
Shilov, G. E., 147, 155, 245, 350, 351 
σ-additivity (see Countable additivity) 
σ-algebra, 35 
σ-finite measure, 308 
σ-ring, 35 
Signed measure, 344 
Silverman, R. A., 76, 140, 147, 247, 350, 366 
Simple function, 286 
Simplex (see n-dimensional simplex) 
Simply ordered set (see Ordered set) 
Singular charge, 347 
Singular function, 341 
Smirnov, V. I., 247 
Space: 
c, 120 
c_0, 120 
C_[a,b], 39, 57 
C^2_[a,b], 40, 59 
C^n, 119 
C(I, R), 113 
of isolated points, 38, 56 
of rapidly decreasing sequences, 172 
l_2, 39, 57 
l_p, 43 
L_1, 378 
L_2, 383 
m, 41, 120 
R^1, 38, 56 
R^n, 38, 57 
R^∞, 120 
R^n_p, 41 
Spectral radius, 239 
Spectrum, 235 
continuous, 236 
point, 236 
regular point of, 235 
Step function, 211, 316 
Stereographic projection, 14 
Stieltjes integral (see Lebesgue-Stieltjes integral) 
Stieltjes measure, 362, 364 
absolutely continuous, 363 
discrete, 363 
generating function of, 362 
singular, 363 
Strong convergence, 195 
Strong topology, 184 
in conjugate space, 190 
Subcover, 83 
Subset, 1 
proper, 2 
Subspace, 121 
closed, 140 
generated by a set, 122 
invariant, 238 
proper, 121 
Successive approximations, method of, 66, 67 
Sum of sets, 2 
Summable function, 294, 296, 308 
complex, 387 
Supremum, 41, 51 
Symmetric difference, 3, 4 
Symmetric set, 171 
Symmetry, 7 
System of sets, 31 
centered, 92 
trace of, 80 
unit of, 31 


T 


Test functions, 208 
convergence of, 208 
Test space, 208, 216 
Tolstov, G. P., 140, 145 
Topological linear space, 138, 167 ff. 
bounded subset of, 169 
continuous mapping of, 87 
functionals on, 175 
continuous, 175 
continuous linear, 175 ff. 
linear, 175 
locally bounded, 169 
locally convex, 169 
neighborhood base at zero of, 168 
normable, 169 
weak topology in, 195 
Topological space(s), 78 ff. 
base for, 81 
bicompact, 96 
closed sets of, 79 
compact, 92 
completely regular, 92 
connected, 84 
convergence in, 84 
countably compact, 95 
cover (covering) of, 83 
hereditary property of, 87 
locally compact, 97 
metrizable, 90 
normal, 86 
open sets of, 78 
points of, 79 
real functions on, 108 
relatively compact subset of, 97 
relatively countably compact subset of, 97 
with a countable base, 82 
Topology, 78 
generated by a system of sets, 80 
relative, 80 
strong, 184, 190 
stronger, 80 
weak, 195, 200 
weak*, 202 
weaker, 80 
Total variation, 328, 346 
Totally bounded set, 98 
Transcendental number, 19 
Transfinite induction, 29 
Transfinite ordinal, 24 
Transitivity, 7 
Triangle inequality, 37, 138 
T_1-space, 85 
T_2-space, 85 
Two-sided ideal, 252 
Tychonoff space, 92 
Type(s), 22 
ordered sum of, 23 
ordered product of, 23 
vs. power, 22 


U 


Uncountability of real numbers, 15 
Uncountable set, 10 
Uniform continuity, 109 
Uniformly bounded family of functions, 102 
Union of sets, 2 
Unit (of a system of sets), 31 
Upper bound (in a partially ordered set), 28 
Upper limit, 111 
Upper semicontinuous function, 110 
Urysohn’s lemma, 91 
Urysohn’s metrization theorem, 90 


V 


van der Waerden, B. L., 327 
Variance, 366 
Variation: 
bounded, 328 
negative, 346 
positive, 346 
total, 328, 346 
Vector space (see Linear space) 
Volterra equation, 75 
Volterra operator, 243 


W 


Weak convergence, 195 
of functionals, 200 
Weak* convergence, 202 
Weak topology, 195 
in conjugate space, 200 
Weak* topology, 202 
Weierstrass’ approximation theorem, 140, 145 
Well-ordered set, 23 
(initial) section of, 25 
order type of, 24 
remainder of, 25 
smallest element of, 23 
Well-ordering theorem, 27 


Z 


Zermelo, E., 27 
Zero element, 118 
Zorn’s lemma, 28 